Abstract


In this thesis we outline new research in integer factorisation with applications to 
public-key cryptography. In particular, we consider the number field sieve, the newest 
and fastest known method for factorising integers used in public-key cryptosystems. We 
improve so-called polynomial selection methods for the number field sieve. Polynomial 
selection has been a major open problem for the number field sieve since its inception. 
We address the problem by modelling polynomial yield, and giving methods for finding 
polynomials with good yield. The improvements described here were used to obtain a 
new factorisation record, the 140 digit RSA modulus RSA-140, and are being used to 
obtain a further record by factorising a 512 bit RSA modulus RSA-155. 
The following symbols are used in this thesis without further definition.

$\mathbb{Z}$                      the (rational) integers
$\mathbb{Z}_n$                    the integers modulo $n$
$\mathbb{Q}$                      the rational numbers
$\mathbb{R}$                      the real numbers
$\mathbb{C}$                      the complex numbers
$\mathbb{Z}[x]$                   the set of polynomials in $x$ with integer coefficients
$\mathbb{Q}(\alpha)$              the number field defined by $\alpha$, with $\alpha \in \mathbb{C}$ satisfying $f(\alpha) = 0$
                                  for some $f \in \mathbb{Z}[x]$ of degree $d$
$\mathbb{Z}[\alpha]$              the ring of $\mathbb{Z}$-linear combinations of $\{1, \alpha, \ldots, \alpha^{d-1}\}$ with $\alpha$
                                  as above
                                  the ring of (algebraic) integers of $\mathbb{Q}(\alpha)$
$\left(\frac{n}{q}\right)$        the Legendre symbol for $n$ mod $q$
                                  the exponent of the largest power of $p$ dividing $n$
Chapter 1


Introduction 


Throughout life, and in particular throughout this thesis, N is a large integer requiring 
factorisation. 

Some integers N require factorisation simply because they’re large and interesting. 
Some require factorisation because, as well as being large and interesting, they’re 
important cryptographically. We are concerned with factorising integers of the latter 
kind. 

The most commonly used public-key cryptosystem is the RSA system. The security 
of RSA relies on certain large N being difficult to factorise. The best measure of the 
level of security offered by RSA is our ability to factorise such N. 

Asymptotically and in practice, the fastest algorithm for factorising these integers
is the number field sieve. The speed at which the number field sieve factorises N is
determined by the supply of smooth integers (integers with no large prime factors)
of a particular form. Given a certain pair of polynomials $f_1(x), f_2(x) \in \mathbb{Z}[x]$, each
irreducible over $\mathbb{Q}$ and of degree $d_i$ for $i = 1, 2$, we use the homogeneous polynomials
$F_i(x, y) = y^{d_i} f_i(x/y)$. We search for coprime integer pairs $(a, b)$ at which both $F_1(a, b)$
and $F_2(a, b)$ are smooth. This search is the rate-determining step in factorising N
using the number field sieve.

The area in which the number field sieve has had the greatest capacity for
improvement is in the selection of these polynomials. “Better” polynomials are ones
which produce more smooth values. We call the problem of choosing better number 
field sieve polynomials the polynomial selection problem. 

Motivated by both assessing and compromising the security of RSA and similar 
systems, we consider in this thesis the polynomial selection problem for the number 
field sieve. The improvements given here were used to set a new record for factorisation 
of “general” N, by factorising the 140 digit RSA modulus RSA-140. At the time of 
writing, our improvements are also being used to factorise a 512 bit (155 digit) RSA 
modulus RSA-155. 

In this chapter we introduce the polynomial selection problem. In Section 1.1 we
consider the cryptographic context of the problem. In Section 1.2 we introduce the
strategy for factorising integers like RSA moduli. In Section 1.3 we outline very briefly
the number field sieve, and focus on the polynomial selection problem. We are then in
a position in Section 1.4 to outline the contribution of this thesis, and in Section 1.5
to outline the thesis itself.


1.1 Integer Factorisation and Public-Key Cryptography 


Public-key cryptography [28] is a crucial aspect of modern communication networks. 
Its aim is to ensure that communications over networks are secure. 

Most public-key cryptosystems rely for their security on certain number-theoretic
problems being intractable. By a large margin, the most commonly used form of
public-key cryptography is the RSA cryptosystem. RSA, and some other systems like
it, rely for their security on the problem of integer factorisation. Most other
public-key cryptosystems in use rely for their security on instances of the discrete
logarithm problem. Our results also affect some of these systems.

Next we describe the RSA cryptosystem. We also mention some alternative
public-key cryptosystems relying on the discrete logarithm problem. Our treatment is very
brief. A good survey of public-key cryptosystems, from the perspective of their
underlying number-theoretic problem, is [79].


1.1.1 The RSA Public-Key Cryptosystem 


In the RSA public-key cryptosystem [68] the public/private-key pair is generated from
two distinct large primes p, q of approximately the same size (in fact, they usually have
the same number of bits). Let $N = pq$. Choose $e \in \mathbb{Z}$ coprime to $\varphi(N) = (p-1)(q-1)$
and, using for example the extended Euclidean algorithm, compute

    $d = e^{-1} \bmod \varphi(N)$.

The public key is then the pair $(e, N)$ and the private key is $d$. Encryption of the
message block M occurs by computing

    $C = M^e \bmod N$.

Decryption occurs by computing

    $C^d = M^{ed} = M \bmod N$.
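
The key generation, encryption and decryption steps above can be sketched in Python. This is only a minimal illustration under stated assumptions: the primes 61 and 53, the exponent 17 and the function names are toy choices made here, and moduli used in practice are vastly larger. It needs Python 3.8 or later, where `pow` accepts a negative exponent with a modulus.

```python
from math import gcd


def rsa_keypair(p, q, e):
    """Return ((e, N), d) for distinct primes p, q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)          # phi(N) = (p - 1)(q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to phi(N)")
    d = pow(e, -1, phi)              # d = e^(-1) mod phi(N)
    return (e, n), d


def encrypt(m, public_key):
    """C = M^e mod N."""
    e, n = public_key
    return pow(m, e, n)


def decrypt(c, d, n):
    """M = C^d mod N."""
    return pow(c, d, n)


# Toy parameters (assumed for illustration only).
public_key, d = rsa_keypair(61, 53, 17)
c = encrypt(65, public_key)
m = decrypt(c, d, public_key[1])
```

Anyone who can factorise N = 3233 into 61 and 53 can repeat the `rsa_keypair` computation and recover d, which is the sense in which factorising N compromises the system.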


Clearly, factorising N suffices to compromise the security of the system. Also,
factorising N is equivalent to computing $\varphi(N)$ (see for example [11]). Whether the
security of RSA is equivalent to factorising N is an open problem. For the paranoid, 
there are versions of RSA whose security is provably “almost” equivalent to factorising 
N [66], [84], although these methods suffer other disadvantages [73]. 
For practical purposes, RSA depends for its security on the difficulty of factorising
N, the RSA modulus. Hence, the level of security provided by the system depends 
on our ability to factorise RSA moduli. Current recommendations of modulus length 
of course depend on the level of security desired. The minimum recommended length 
for financial and government communications requiring a high level of security is 1024 
bits (319 digits). However, 512 bit moduli have been and still are commonly used. 
For example, 512 bits is the default length on certain Internet browsers, and therefore 
such moduli protect a large portion of electronic commerce conducted over the Internet. 
Adi Shamir estimates that 512 bit RSA moduli protect approximately 95% of Internet
electronic commerce [70].


1.1.2 Other Public-Key Systems 


After integer factorisation, the most common problem on which the security of
public-key cryptosystems is based is the discrete logarithm problem. Let G be a finite group.
Without loss of generality we can assume G is cyclic, with generator g. Given $a \in G$,
the discrete logarithm problem in G is to compute x such that

    $g^x = a$.


Several cryptographic protocols rely on the discrete logarithm problem for their 
security, for example the Diffie/Hellman key exchange protocol [28], ElGamal [31], 
and elliptic curve systems [38]. 


The most desirable groups G for cryptographic purposes are the ones for which

• the group multiplication law can be implemented efficiently, and

• the discrete logarithm problem in G is believed to be difficult.


Two types of group have emerged in practice as satisfying these requirements: the
multiplicative group of a finite field and the group of points on an elliptic curve over
a finite field. We denote the first type by GF(q)* for $q = p^n$, p prime, and the second
by E(q). In both cases, q will usually be either p for odd p, or $2^n$ with $n > 1$.

There is an analogous version of the number field sieve for factorisation which
computes discrete logarithms in some groups of the first type, GF(q)*. Our improvements
to the number field sieve for factorisation carry over to this version, and so have an 
impact on the security of systems relying on these instances of the discrete logarithm 
problem. We discuss this after giving more details on the number field sieve. 

It is significant from the point of view of the security of elliptic curve cryptosystems 
that there is no known analogue of the number field sieve addressing the elliptic curve 
discrete logarithm problem. Hence, our improvements do not apply there. Since 
progress on computing discrete logarithms in E(q) lags behind that on factorisation 
and discrete logarithms in GF(q)*, elliptic curve cryptography is emerging as the best 
alternative to RSA. 
1.2 Factorisation of General Integers


We focus now on factorisation methods relevant to cryptographic applications. 

Not all integers of a given size are equally difficult to factorise. Some are trivial, 
and some are of a form that makes them susceptible to attack by special methods. In 
particular, the elliptic curve method [50], [10], and the Pollard-rho method [62] find 
“small” factors particularly well. The Pollard $p - 1$ method [61] succeeds in finding
factors p for which $p - 1$ contains no large prime factors. For security, RSA moduli
need to be integers which are amongst the most difficult to factorise. That is, they
need to be integers not susceptible to attack because of their particular form. We refer
to integers with no helpful special form as general integers. RSA moduli, integers

    $N = pq$

for primes p and q which are both “large” and not far from $\sqrt{N}$, lie amongst the
general integers.

A family of algorithms has been developed for factorisation of general integers. 
The family is characterised by the factorisation strategy adopted by its members. 
The number field sieve is the newest and best performing member of the family. Its 
immediate predecessor is an algorithm called the multiple polynomial quadratic sieve 
(MPQS) [65], [74]. Several impressive factorisations of RSA moduli were performed 
using MPQS before the number field sieve came into being. By way of background 
to the number field sieve we now explain the factorisation strategy of algorithms in 
the family, concentrating on MPQS and the number field sieve. For a more thorough
background on factorisation algorithms the reader should refer to, for example, [67].


1.2.1 Factorisation by Congruent Squares 


Long before the introduction of the RSA cryptosystem, Fermat posed the forerunner
to the modern strategy. He noted that if positive integers x and y with $x - y > 1$ can be
found for which

    $N = x^2 - y^2$

then a non-trivial factorisation of N follows immediately as the product $(x - y)(x + y)$.
The task then is to find x and y, that is, to find the representation of N as a difference
of two squares. The definite article is used advisedly: every odd N which is a product of
two distinct primes has a unique such representation. Therein
lies a problem however; if N is large then finding the unique pair (x, y) becomes
difficult.
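
Fermat's observation translates directly into a search. The following Python sketch is one simple way to carry it out, assuming we try x = ⌈√N⌉, ⌈√N⌉ + 1, ... until x² − N is a perfect square; the function name and the example N = 5959 are illustrative choices, not taken from any reference.

```python
from math import isqrt


def fermat_difference_of_squares(n):
    """Return (x, y) with n = x^2 - y^2, searching x upward from ceil(sqrt(n)).

    n must be odd and at least 3. For prime n the search only stops at the
    trivial representation x - y = 1.
    """
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd integer >= 3")
    x = isqrt(n)
    if x * x < n:
        x += 1                      # x = ceil(sqrt(n))
    while True:
        y_squared = x * x - n
        y = isqrt(y_squared)
        if y * y == y_squared:      # x^2 - n is a perfect square
            return x, y
        x += 1


x, y = fermat_difference_of_squares(5959)
factors = (x - y, x + y)
```

The number of values of x tried grows with the distance between the two factors, which is one way to see why the method becomes impractical for large N.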

An improvement usually attributed to Kraitchik [43] partly overcomes this problem.
His suggestion is this: instead of requiring N to be a difference of two squares, we should
require only some multiple of N to be a difference of two squares. There are many
multiples of N, so there should be many representations for which to search. Hence,
we now require integers x and y such that

    $x^2 \equiv y^2 \pmod{N}$.    (1.1)


That is, we require two congruent squares mod N. The trade-off is that we are no
longer guaranteed that the representation will produce a non-trivial factorisation of
N. We are guaranteed that $N \mid (x - y)(x + y)$, from which we may hope that

    $1 < \gcd(x \pm y, N) < N$.    (1.2)


If, for example, N = pq and both p and q divide x + y then (1.2) will not hold. However
it is simple to show that amongst a sample of representations (1.1) we can expect (1.2) 
to hold in at least one half of the cases. Hence we say that given a representation as 
in (1.1), we get a non-trivial factor of N as in (1.2), with probability at least one half. 
If N is an RSA modulus then this probability is exactly one half. In practice, finding 
many representations (1.1) requires only trivially more effort than finding just one, so 
this does not present an obstacle. 
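
A small Python sketch makes the step from (1.1) to (1.2) concrete. The helper name and the toy modulus N = 1649 = 17 · 97 are assumptions made for illustration; the second call shows a representation with x ≡ −y (mod N), for which (1.2) fails.

```python
from math import gcd


def factor_from_congruent_squares(x, y, n):
    """Given x^2 = y^2 (mod n), return a non-trivial factor of n, or None."""
    if (x * x - y * y) % n != 0:
        raise ValueError("x^2 and y^2 are not congruent mod n")
    for candidate in (gcd(x - y, n), gcd(x + y, n)):
        if 1 < candidate < n:
            return candidate
    return None                     # only the trivial factors 1 and n appear


# 114^2 and 80^2 are both congruent to 1453 mod 1649.
useful = factor_from_congruent_squares(114, 80, 1649)

# 1569 = -80 (mod 1649), so this representation is useless.
useless = factor_from_congruent_squares(1569, 80, 1649)
```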

Despite the fact that it is simple, finding two congruent squares mod N becomes 
the strategy for factorising large general integers. Both MPQS and the number field 
sieve adopt this strategy. The question now becomes how to construct the congruent
squares.


1.2.2 Congruent Squares from Smooth Polynomial Values 


Congruent squares are constructed from so-called B-smooth integers. 
Definition 1.2.1 An integer is B-smooth if its largest prime factor is at most B. 


If the precise value of B is immaterial we refer to these simply as smooth integers. 

There is a well known procedure for constructing squares from many smooth
integers. Let $\pi(B)$ denote the number of primes at most B. Suppose we collect $K > \pi(B)$
integers which are B-smooth. By some linear algebra modulo 2, we can be guaranteed
to find a subset of these K integers the product over which is a square.

How does this work? Let the smooth integers we collect be $v_i$ for $i = 1, \ldots, K$.
We record the factorisation of each $v_i$ over the primes at most B in an exponent vector
$\mathbf{v}_i$ of length $\pi(B)$. The j-th entry of $\mathbf{v}_i$ is 1 when the j-th prime appears to an odd
exponent in $v_i$, and 0 otherwise. That is, $\mathbf{v}_i$ records the square-free portion of $v_i$, with
ones denoting the “square-free primes” in $v_i$. Now we find a subset S of the $v_i$ which
“pairs-up” the square-free primes appearing throughout the $v_i \in S$. How? We
form a matrix over $\mathbb{Z}_2$ listing the $\mathbf{v}_i$ for $i = 1, \ldots, K$ as rows. Since $K > \pi(B)$, the
number of columns, there is guaranteed to be a linear dependency, modulo 2, amongst
the rows. The set S now consists precisely of the elements $v_i$ corresponding to the
rows $\mathbf{v}_i$ which contribute to the dependency. The product over S is then guaranteed
to be a square, since all the square-free primes are “paired-up” across the product.
Hence, from sufficiently many smooth integers, we can construct a square integer after
some linear algebra modulo 2. In practice, N is very large and therefore so is the
corresponding matrix (typically there are millions of rows and columns).
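
The procedure can be sketched in Python on a toy scale. The sketch below assumes trial division to obtain the exponent vectors and plain Gaussian elimination over $\mathbb{Z}_2$ with each vector stored as a bitmask; the function names are illustrative, and real implementations use sieving and sparse-matrix methods instead.

```python
from math import isqrt, prod


def primes_up_to(b):
    """All primes at most b, by trial division (fine for tiny b)."""
    return [p for p in range(2, b + 1)
            if all(p % q for q in range(2, isqrt(p) + 1))]


def parity_vector(v, primes):
    """Exponent vector of v mod 2 as a bitmask, or None if v is not smooth."""
    if v < 1:
        return None
    mask = 0
    for j, p in enumerate(primes):
        while v % p == 0:
            v //= p
            mask ^= 1 << j          # flip the parity of the j-th prime
    return mask if v == 1 else None


def square_subset(values, b):
    """Return a sub-list of the b-smooth values whose product is a square."""
    primes = primes_up_to(b)
    pivots = {}                     # leading bit -> (reduced row, member mask)
    for i, v in enumerate(values):
        row = parity_vector(v, primes)
        if row is None:
            continue                # not b-smooth, discard
        members = 1 << i            # which inputs were combined into this row
        while row:
            lead = row.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = (row, members)
                break
            pivot_row, pivot_members = pivots[lead]
            row ^= pivot_row
            members ^= pivot_members
        else:                       # row reduced to zero: dependency found
            return [values[k] for k in range(len(values)) if members >> k & 1]
    return None


subset = square_subset([6, 10, 15, 7, 21], 5)
product = prod(subset)
```

Here 6, 10 and 15 are 5-smooth with exponent vectors (1,1,0), (1,0,1) and (0,1,1), which sum to zero modulo 2, so their product 900 is a square.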

Recall now that, more than just constructing squares, we are required to construct 
congruent squares modulo N. To ensure that this will be the case on collection of 
smooth integers, we require the smooth integers to be of a special form. In particular, 
we require them to be values taken by certain polynomials. For the number field sieve 
the mechanism by which we are guaranteed to get congruent squares from smooth 
polynomial values is explained in Chapter 2. For now it suffices to know that both 
MPQS and the number field sieve proceed by constructing congruent squares modulo 
N, and that this occurs by collection of sufficiently many smooth values of certain 
polynomials. 

In MPQS, we collect smooth values of certain quadratic polynomials. The number
field sieve is more complicated, because there we collect smooth values of pairs
of polynomials. For example, often one polynomial will be quintic whilst the other
is linear. In both MPQS and the number field sieve, the collection of these smooth
values is overwhelmingly the most time consuming stage of the process. This stage is
called sieving. Other stages are complicated and time consuming too - for example a
large matrix requires elimination modulo 2 - but sieving dominates the run-time of the
algorithms. The polynomial selection problem for the number field sieve involves
reducing the sieving effort required by choosing polynomials which produce many smooth
values.


1.2.3 Complexity Estimates 


We now consider the relationship between the collection of smooth polynomial values 
and the run-time of the algorithms. The asymptotic complexity analyses of both 
MPQS and the number field sieve are tied to the appearance of smooth integers in 
the context of each algorithm. The results of these analyses give an intuitive picture 
of where the asymptotic advantage of the number field sieve lies, and what should be 
exploited to best leverage this advantage in practice. 

The following function is central to the analysis. Suppose we have real variables
v, w with $0 \le v \le 1$. Let the L-function be given by

    $L_x[v, w] = \exp\left( (w + o(1)) (\log x)^{v} (\log\log x)^{1-v} \right)$.

The more important variable is v. Think of the L-function as interpolating (along v)
between polynomial and exponential functions of $\log x$. Indeed, $L_x[1, w] = x^{w + o(1)}$
and $L_x[0, w] = (\log x)^{w + o(1)}$. The value of w is not immaterial, but makes a difference
asymptotically only if v is constant.
Number   | Date      | MIPS-years
RSA-100  | Apr 91    |
RSA-110  | Apr 92    |
RSA-120  | June 93   | 835
RSA-129  | Apr 94    | 5000
RSA-130  | Apr 96    | 1000
RSA-140  | Feb 99    | 2000
RSA-155  | Sept 99 ? | < 10000

Table 1.1: RSA Challenge records


Algorithms in the “congruent squares” family typically have heuristic asymptotic
run-times described by the L-function. Heuristically, the time taken by MPQS to factor
N is $L_N[1/2, 1]$ as $N \to \infty$. In fact, all general factorisation algorithms preceding
MPQS also have asymptotic run-time at best $L_N[1/2, c]$ for some $c > 1$. The number
field sieve however, has asymptotic run-time

    $L_N[1/3, (64/9)^{1/3}]$.


The appearance of v = 1/3 is exciting. It brings the number field sieve significantly 
closer to a polynomial-time algorithm than its predecessors. We explore this issue 
in Chapter 2. It is worth noting now that the reason the number field sieve defeats 
other algorithms asymptotically is that the integers it requires to be smooth are much 
smaller asymptotically than those of other algorithms. That is, asymptotically, the
number field sieve guarantees a much better supply of smooth polynomial values.
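
To get a feel for the two run-time expressions, the following Python sketch evaluates the L-function numerically. It rests on a strong assumption: the o(1) term is simply set to zero, so the absolute values are not meaningful operation counts and only the trend as N grows should be read from them. The function and constant names are illustrative.

```python
from math import log


def log_l(bits, v, w):
    """Natural logarithm of L_x[v, w] for x = 2**bits, with o(1) taken as 0."""
    ln_x = bits * log(2)
    return w * ln_x ** v * log(ln_x) ** (1 - v)


MPQS = (1 / 2, 1.0)                          # L_N[1/2, 1]
NFS = (1 / 3, (64 / 9) ** (1 / 3))           # L_N[1/3, (64/9)^(1/3)]

for bits in (512, 768, 1024):
    mpqs_bits = log_l(bits, *MPQS) / log(2)  # express each value as a power of 2
    nfs_bits = log_l(bits, *NFS) / log(2)
    print(f"{bits}-bit N: MPQS ~ 2^{mpqs_bits:.1f}, NFS ~ 2^{nfs_bits:.1f}")
```

With the o(1) terms dropped, the gap between the two expressions widens as the bit length increases, which reflects the smaller exponent v = 1/3.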


1.2.4 The RSA Factorisation Challenge 


The number field sieve is clearly the state-of-the-art asymptotically, but what is the 
situation in practice? State-of-the-art for general integer factorisation in practice is 
measured by progress through the RSA Factorisation Challenge. The RSA Factorisation
Challenge is a list of genuine RSA moduli. The Challenge is administered by RSA
Laboratories [69] precisely to encourage and keep track of factorisation research.

The challenge numbers begin at length 100 digits, and there is one at every length
110, 120, ..., 500 digits. There are also moduli of length 155 digits (512 bits), 232
digits (768 bits), 309 digits (1024 bits) and 617 digits (2048 bits).

Table 1.1 shows progress through the list, including the record RSA-140 factorisation
and the impending RSA-155 record.

The MIPS-years figures for the effort required to factorise each number are very
approximate. For RSA-130 and RSA-140 the figures are a little misleading. Lower, but
still conservative, “could-have-done-it-in” estimates for each are 500 and 1500
MIPS-years respectively. The prediction for RSA-155 is based on extrapolation from the 2000
MIPS-years figure for RSA-140. We re-visit this estimate in Chapter 6.

The integers of most interest to us are RSA-129, RSA-130, RSA-140 and RSA-155.
RSA-129 is not formally part of the RSA Challenge list; it is included here
for illustration purposes only. It was set as a challenge in the August 1977 edition of
Scientific American. Accompanying the challenge was the now infamous claim that the 
RSA-129 modulus would be secure for 40 quadrillion years. The RSA-129 factorisation 
is the last record set using MPQS. The RSA-130 record is significant because it is the 
first set using the number field sieve. Notice the decrease in estimated MIPS-years 
effort from RSA-129 to RSA-130. Notice also that the RSA-140 record, even with 
the conservative estimate of effort, was still set with less than half the effort used for 
RSA-129. 

These figures refer only to the effort spent on the sieving stage of each algorithm:
the stage during which smooth polynomial values are collected. Certainly this is the
stage that requires the most effort, but particularly with the number field sieve, other
stages are also complicated and time consuming.


1.3 The Number Field Sieve Briefly 


In this section we outline very briefly the steps involved in the number field sieve. 
More details are given in Chapter 2. We also note the existence of an analogue of the
number field sieve for computing discrete logarithms in $\mathbb{Z}_{p-1}$. Finally we focus on the
polynomial selection problem. 

The number field sieve was developed from ideas of Pollard [63] in 1988. It was 
initially formulated to apply to integers of a special form (for which the polynomial 
selection step is easy). This earlier version is now referred to as the special number 
field sieve. The special number field sieve had early success with the factorisation of
the ninth Fermat number $F_9$ [47]. Since the polynomial selection step is easy in the
special number field sieve, that is, an obvious pair of exceptionally good polynomials
is known in advance for N, its asymptotic run-time is only

    $L_N[1/3, (32/9)^{1/3}]$.

For completeness we note that the largest integer factorised by the special number field
sieve is $10^{211} - 1$ [17].

Our focus is on the polynomial selection step for N where no “special form”
polynomials are available. So we do not consider the special number field sieve any further
than to say it was extended to apply to general N in [14]. Implementations for general
N emerged soon thereafter ([5], [33], [12]). The RSA-140 and RSA-155 factorisations
use the implementation of [33], and a variation of [5].
1.3.1 The Algorithm


Suppose we have two polynomials $f_1, f_2 \in \mathbb{Z}[x]$ which are irreducible over $\mathbb{Q}$ and have
a common root m mod N. Given $\alpha_i \in \mathbb{C}$ for which $f_i(\alpha_i) = 0$ for $i = 1, 2$, distinct
squares are constructed in the number fields $\mathbb{Q}(\alpha_i)$. Viewed in $\mathbb{Z}_N$ as images under
homomorphisms defined by sending each $\alpha_i \mapsto m$, these squares give rise to (1.1).

The squares in $\mathbb{Q}(\alpha_i)$ are constructed from smooth values of the homogeneous
polynomials $F_i(x, y) = y^{d_i} f_i(x/y)$ where $d_i = \deg f_i$. In fact, coprime integer pairs
$(a, b)$ at which both $F_1(a, b)$ and $F_2(a, b)$ are smooth are sought. Such a pair is called
a relation. Relations are identified using a sieving process. Many millions of relations 
are required for interestingly large values of N. Thus, collecting relations (sieving) is 
a very time consuming process. 
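
The definition of a relation can be checked directly for small inputs. The Python sketch below is an illustration only: it tests smoothness by trial division rather than by sieving, and the pair $f_1(x) = x^3 + 15x^2 + 29x + 8$, $f_2(x) = x - 31$ with N = 45113 and m = 31 is a toy example assumed here (both polynomials vanish at m modulo N). Coefficient lists are ordered from the constant term upward.

```python
from math import gcd


def homogeneous_value(coeffs, a, b):
    """F(a, b) = b^d f(a/b) = sum of c_i * a^i * b^(d - i)."""
    d = len(coeffs) - 1
    return sum(c * a ** i * b ** (d - i) for i, c in enumerate(coeffs))


def is_smooth(value, bound):
    """True if |value| is non-zero and has no prime factor above bound."""
    value = abs(value)
    if value == 0:
        return False
    for p in range(2, bound + 1):   # composite p never divide at this point
        while value % p == 0:
            value //= p
    return value == 1


def is_relation(f1, f2, a, b, bound):
    """True if gcd(a, b) = 1 and both F1(a, b) and F2(a, b) are smooth."""
    return (gcd(a, b) == 1
            and is_smooth(homogeneous_value(f1, a, b), bound)
            and is_smooth(homogeneous_value(f2, a, b), bound))


F1 = [8, 29, 15, 1]                 # x^3 + 15x^2 + 29x + 8 (toy example)
F2 = [-31, 1]                       # x - 31
found = is_relation(F1, F2, -1, 1, 29)       # F1 = -7, F2 = -32
not_found = is_relation(F1, F2, 1, 1, 29)    # F1 = 53, a prime above 29
```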

Once sufficiently many relations are collected, a large matrix is constructed over
$\mathbb{Z}_2$ much as in the procedure outlined in Section 1.2.2. Finding the squares in each
$\mathbb{Q}(\alpha_i)$ requires finding a linear dependency over $\mathbb{Z}_2$ amongst the rows of this matrix.

It is not the squares in $\mathbb{Q}(\alpha_i)$ that are required for (1.1) however, but the
homomorphic images in $\mathbb{Z}_N$ of their square roots. To find these images, we require the square
root of each square in $\mathbb{Q}(\alpha_i)$. Upon finding those square roots, computing the relevant
gcd will, with probability at least one half, produce a non-trivial factor of N.


Thus the number field sieve has the following steps: 
1. Polynomial Selection 

2. Sieving 

3. Matrix Reduction 

4. Square Root. 


Steps 2-4 are well studied (which is not to suggest that there is no room for 
improvement). Sieving methods are well developed. That is, we have efficient methods 
for detecting whatever smooth polynomial values are there. The square root step is 
essentially solved. There exists an algorithm for performing the matrix step, although 
very large matrices are becoming a problem from an implementation perspective. 

The polynomial selection step however, has been a major open problem since the 
inception of the number field sieve. The problem is to choose polynomials which ensure 
a good supply of smooth values in practice. The main aim in doing so is to decrease 
sieving times. A pleasant side effect is that we can also, in the presence of other 
techniques (see Chapter 2), reduce the expected matrix size. Before expanding on 
polynomial selection in Section 1.3.3, we note another area affected by improvements
in polynomial selection.
1.3.2 The Number Field Sieve for Discrete Logarithms


Algorithms for computing discrete logarithms in G fall into two categories: those which
work for arbitrary G, and those which rely on properties of particular group
representations. We do not discuss the former category here, other than to say that the best
known algorithms in this category have run-times that are exponential in $\log h$, where
$h = |G|$.
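
As an illustration of the first category, one well-known generic method is the baby-step giant-step algorithm, which uses about $\sqrt{h}$ group operations and is therefore exponential in $\log h$. The Python sketch below specialises it to GF(p)* for a prime p; the function name and the toy parameters p = 101, g = 2 are assumptions made here, and Python 3.8 or later is needed for the negative exponent in `pow`.

```python
from math import isqrt


def baby_step_giant_step(g, a, p):
    """Return x with g^x = a (mod p) for prime p, or None if none exists."""
    h = p - 1                        # order of the group GF(p)*
    m = isqrt(h) + 1                 # m * m > h, so every x is i * m + j
    baby = {}
    power = 1
    for j in range(m):               # baby steps: store g^j -> j
        baby.setdefault(power, j)
        power = power * g % p
    giant = pow(g, -m, p)            # g^(-m) mod p
    gamma = a % p
    for i in range(m):               # giant steps: compare a * g^(-i m)
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * giant % p
    return None


target = pow(2, 57, 101)             # 2 generates GF(101)*
x = baby_step_giant_step(2, target, 101)
```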

All known sub-exponential algorithms fall into the latter category. They are
collectively called index calculus algorithms. A survey of index calculus algorithms for
discrete logarithm computations is found in [72]. The general strategy of index calculus 
algorithms is to compute the discrete logarithms of many small elements in G, then 
express the desired logarithm as a linear combination of the small ones. The first stage, 
computation of the logarithms of many small elements, is done in a similar manner 
to general factorisation algorithms. That is, collection of sufficiently many “smooth” 
elements followed by reduction of a large matrix. (One difference is that the matrix
reduction is done modulo $l$, for each large divisor $l$ of $h - 1$, rather than modulo 2.)

For the strategy to work in the discrete logarithm case G must have a
representation which admits a notion of smoothness. Examples relevant to cryptography are
$GF(p)^* = \mathbb{Z}_{p-1}$ for odd p and GF(q)* where $q = 2^n$ with n large (say, $n > 160$).
In the former case the representation is simply $\mathbb{Z}_{p-1}$ so smoothness is defined as for
integers. In the latter case the representation is the polynomial ring $\mathbb{Z}_2[x]/(g(x))$ for
some irreducible polynomial $g \in \mathbb{Z}_2[x]$ of degree n. A polynomial in the ring is
considered smooth if its irreducible factors all have small degree compared to n. There is no
known sub-exponential attack on elliptic curve discrete logarithms precisely because
there is no known analogue of “smoothness” for elements in those groups.

The number field sieve applies to the computation of discrete logarithms in GF(p)*,
equivalently, in $\mathbb{Z}_{p-1}$. Prior to the emergence of the number field sieve, the best
known algorithm for discrete logarithms in $\mathbb{Z}_{p-1}$ was the Gaussian Integers method of
[21]. This method has sub-exponential run-time $L_p[1/2, 1]$. Historically, this method
inspired the number field sieve for factorisation. Things came full circle
when Gordon showed in [36] that the number field sieve for factorisation could be
applied to compute discrete logarithms in $\mathbb{Z}_{p-1}$. Schirokauer’s improvement [71] gives
the algorithm now referred to as the number field sieve for discrete logarithms in $\mathbb{Z}_{p-1}$,
with heuristic asymptotic run-time

    $L_p[1/3, (64/9)^{1/3}]$.


The most time consuming stage in the discrete logarithm number field sieve is,
as with factorisation, the collection of smooth polynomial values. The polynomial
selection problem for discrete logarithms is similar to that for factorisation. Any
improvements for factorisation, therefore, should carry over directly to discrete logarithms.
Record discrete logarithm calculations in $\mathbb{Z}_{p-1}$ lag somewhat behind the
corresponding factorisation record. Weber [80] has implemented Schirokauer’s algorithm.
Using this implementation the record general discrete logarithm computation using 
the number field sieve is that for an 85 digit p in [81]. A larger record, a 129 digit p, 
was set using the special number field sieve in [82]. The main reason for the lag behind 
factorisation is that the matrix reduction is more difficult for the discrete logarithm 
case than for the factorisation case. As mentioned above, the matrix must be reduced 
modulo some large prime l, not just mod 2. 

Our polynomial selection improvements are yet to be applied to the number field
sieve for discrete logarithms in $\mathbb{Z}_{p-1}$. We would expect significant improvements once
this is done. Not only will sieving time be greatly reduced, but as noted above, better
polynomial selection can reduce the expected matrix size.


1.3.3 The Polynomial Selection Problem 

So far we have introduced the following:

• The asymptotic advantage of the number field sieve is that its polynomials
  guarantee a better supply of smooth values than is the case for previous algorithms.

• The polynomial selection problem concerns how to exploit this advantage in
  practice. The aim is to choose polynomials which generate many smooth values
  and so reduce the effort required in the time consuming sieving step.

• For N as large as the values we consider in Chapter 6, the matrix step is also
  troublesome. An advantage of better polynomial selection is that the saving in
  sieving time is sufficient that, in effect, sub-optimal smoothness bounds can be
  chosen to decrease the matrix size.


There are essentially two known methods for generating suitable polynomial pairs. 
For integers as large as, say, RSA-140, a modified base-m method is the better one. 
With this method, we fix a degree d (for us usually d = 5) then seek $m \approx N^{1/(d+1)}$
and a polynomial $f_1$ of degree d for which

    $f_1(m) \equiv 0 \pmod{N}$.    (1.3)

The polynomial $f_1$ descends from the base-m representation of N. Indeed, we begin
with $f_1(x) = \sum_{i=0}^{d} a_i x^i$ where the $a_i$ are the coefficients of the base-m representation,
adjusted so that $-m/2 \le a_i < m/2$.
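
The starting point of the base-m method can be sketched in Python as follows. This is a plain base-m expansion with balanced digits, not the modified method developed later; it assumes m = ⌊N^{1/(d+1)}⌋ by default and leaves the leading coefficient unadjusted so that it absorbs the final carry. The function names and the toy value N = 45113 are illustrative.

```python
def integer_root(n, k):
    """Largest integer r with r**k <= n, by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // k + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** k <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def base_m_polynomial(n, d, m=None):
    """Return (m, [a_0, ..., a_d]) with sum of a_i * m^i equal to n."""
    if m is None:
        m = integer_root(n, d + 1)
    coeffs = []
    rest = n
    for _ in range(d):
        rest, digit = divmod(rest, m)
        if 2 * digit >= m:           # shift digit into [-m/2, m/2), carry one
            digit -= m
            rest += 1
        coeffs.append(digit)
    coeffs.append(rest)              # leading coefficient a_d
    return m, coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial with coefficients a_0, ..., a_d at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


m, coeffs = base_m_polynomial(45113, 3, m=31)    # toy N, cubic polynomial
m_default, coeffs_default = base_m_polynomial(45113, 3)
```

Since the polynomial takes the value N itself at m, condition (1.3) holds automatically; the companion linear polynomial x − m shares the root m modulo N.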

The alternative polynomial selection method produces two quadratic polynomials 
$f_1$ and $f_2$. As a function of the degree of the individual polynomials, this method
defeats the base-m method. However for sufficiently large integers (like RSA-140)
the combined degree of two quadratics $f_1$ and $f_2$ is too low to compete with quintic
polynomials chosen by the base-m method.
In this thesis we examine polynomials produced by both methods. The problem
demands that we choose “good” polynomials. A polynomial’s “goodness” is determined
by its yield, that is, the number of smooth values it produces for a given smoothness
bound and in a given range. We consider the problem in three stages: first we decide
what to look for, then we decide how to look for it, then we find it.


1.4 Contribution of the Thesis 


We divide the contribution of this thesis into three areas.


1.4.1 Polynomial Yield 


Here we decide what to look for. That is, we develop an understanding of polynomial 
yield. Consider a single polynomial F. We take the yield of F to be influenced by two 
factors, which we call size and root properties. Choosing good F requires choosing F 
with a good combination of size and root properties. 

By size we refer to the magnitude of the values taken by F. It has always been 
well understood that size affects the yield of F. 


Definition 1.4.1 A random value $i_r$ is an integer chosen uniformly at random from
$\{ i \in \mathbb{Z} : 1 \le i \le r \}$.

For a fixed smoothness bound the likelihood of a random value $i_r$ being smooth
decreases rapidly as $r \to \infty$, and does so in a well known manner. Hence, previous
approaches to polynomial selection have sought polynomials whose size is smallest.

The influence of root properties however has been neither well understood nor
adequately exploited. By root properties we refer to the distribution of the roots of F
modulo small $p^k$ for p prime and $k \ge 1$. In short, if F has many roots modulo small
$p^k$, values of F “behave” as if they are smaller than they actually are.

We contribute an understanding of this effect by quantifying root properties and
modelling the interaction between size and root properties to determine polynomial
yield. Only once this is done can we know what is required for a particular polynomial
to have “good” yield.


1.4.2 Polynomial Selection 


Once we understand what to look for, we develop methods for finding it. We seek
methods for generating polynomials with good combinations of size and root properties.
We contribute some tricks and techniques which help find such polynomials. As part
of this process we also contribute techniques for determining, without sieving, the
“goodness” of a particular polynomial.


1.4.3 Polynomials for Record Factorisations


We used the improvements given in this thesis to select polynomials for the record 
factorisation of RSA-140. We found a decrease by a factor of two in the expected 
sieving time (extrapolated from RSA-130), because of the improved selection. We 
found a decrease in the expected matrix size (that is, the number of rows or columns) 
by a factor of about 1.4, because of the improved selection in the context of other 
procedures (see Chapter 2). We used a polynomial pair whose yield is approximately 
8 times that of a random selection. 

We made further and better use of our techniques for the factorisation of RSA-155. 
At the time of writing, the sieving task for RSA-155 is complete. We used a polynomial 
pair whose yield is approximately 13.5 times that of a random selection. We expect 
the factorisation in August/September of 1999. 

Factorisation of a 512 bit RSA modulus is a significant milestone in integer
factorisation for cryptographic purposes, and effectively renders such moduli useless for
serious applications.


1.5 Outline of the Thesis 


Chapter 2 contains mainly background material. In discussing the background material 
we survey the relevant literature. The focus is on aspects most relevant to polynomial 
selection. 

Chapters 3-6 contain the bulk of the research. Chapters 3 and 4 are aimed at 
developing our understanding of polynomial yield. Chapter 3 establishes the framework 
mainly by parameterising root properties. Chapter 4 considers the effect of size and 
root properties together. Initially we examine yield as a function of root properties 
to check their effect and our parameterisation. We then give a simple method of
estimating yield. Material contained in Chapters 3 and 4 appeared in [56] and [57]. 

In Chapters 5-6 we use our understanding of yield to address the polynomial se- 
lection problem. Chapter 5 contains techniques for generating good polynomials. The 
focus is on polynomials relevant to factorisation of large RSA moduli. In Chapter 6 
we investigate these techniques, by re-examining the factorisation of RSA-130 and by 
describing the polynomial selection for the RSA-140 and RSA-155 factorisations. Some 
of the material contained in Chapters 5 and 6 appears in [16] and some was delivered 
in [54]. 


Chapter 7 contains conclusions and suggestions for further work. 

Chapter 2 


Background 


In this chapter we give background material concerning the number field sieve. In 
doing so, we survey the relevant literature. The focus is on issues directly relevant to 
the polynomial selection problem. 

In Sections 2.1 and 2.2 we examine the number field sieve broadly. In Section 2.1 
we focus on the algorithm, its algebraic context, and its practical stages. In Section 
2.2 we examine the asymptotic complexity analysis of the algorithm. In Section 2.3
we consider the polynomial selection problem specifically.


2.1 The Number Field Sieve 


We saw a very brief description of the number field sieve in Chapter 1. Here we 
elaborate, concentrating on matters relevant to polynomial selection. 

In Section 2.1.1 we give a more detailed overview of the algorithm than that in 
Chapter 1. We see from this overview that the algorithm lies in a complicated algebraic 
context. The relevance of the smooth polynomial values is clear once placed in this 
context. In Section 2.1.2 we survey the results on this issue. Since we then understand 
why smooth polynomial values are important, we turn in Section 2.1.3 to how they are 
found. That is, we discuss sieving methods. In Section 2.1.4 we consider the matrix 
step. Rather than describe the algorithms used for reduction of the matrix, we focus on 
the benefit received at the matrix stage from better polynomial selection. We illustrate 
with the factorisation of RSA-140. Finally, in Section 2.1.5 we point to the literature
concerning the square-root stage.


2.1.1 Outline 


Let $f_1(x)$ and $f_2(x) \in \mathbb{Z}[x]$ be irreducible (over $\mathbb{Z}$) polynomials. For ease of
exposition, for the moment we assume $f_1$ and $f_2$ are monic. We will relax this
assumption soon.

Also, suppose there exists some $m \in \mathbb{Z}$ for which

$$f_1(m) \equiv f_2(m) \equiv 0 \bmod N.$$

That is, $f_1$ and $f_2$ have a common root m mod N. This requirement is very limiting.
Finally, suppose that $\alpha_1, \alpha_2$ are complex roots of $f_1$ and $f_2$ respectively, with
corresponding number fields $\mathbb{Q}(\alpha_1)$ and $\mathbb{Q}(\alpha_2)$. Think of m as the analogue mod N
of each $\alpha_i \in \mathbb{C}$ for $f_i$. The key point is that both $f_1$ and $f_2$ have the same analogue.

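We define ring homomorphisms $\varphi_1 : \mathbb{Z}[\alpha_1] \to \mathbb{Z}_N$ and $\varphi_2 : \mathbb{Z}[\alpha_2] \to \mathbb{Z}_N$ by
sending $\alpha_1 \mapsto m \bmod N$ and $\alpha_2 \mapsto m \bmod N$. For example

$$\varphi_1\Bigl(\sum_{i=0}^{d-1} a_i \alpha_1^i\Bigr) = \sum_{i=0}^{d-1} a_i m^i \bmod N$$

with $a_i \in \mathbb{Z}$ being the coefficients of $f_1$ and d the degree of $f_1$.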


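Suppose that there exists a set S of coprime integer pairs (a,b) for which both

$$\prod_{(a,b) \in S} (a - b\alpha_1) = \beta_1^2 \text{ for some } \beta_1 \in \mathbb{Z}[\alpha_1], \text{ and}$$

$$\prod_{(a,b) \in S} (a - b\alpha_2) = \beta_2^2 \text{ for some } \beta_2 \in \mathbb{Z}[\alpha_2].$$

Then

$$\varphi_1(\beta_1^2) = \prod_{(a,b) \in S} \varphi_1(a - b\alpha_1) \equiv \prod_{(a,b) \in S} (a - bm) \bmod N, \text{ and}$$

$$\varphi_2(\beta_2^2) = \prod_{(a,b) \in S} \varphi_2(a - b\alpha_2) \equiv \prod_{(a,b) \in S} (a - bm) \bmod N,$$

giving

$$\varphi_1(\beta_1)^2 \equiv \varphi_2(\beta_2)^2 \bmod N.$$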


The key point is that starting with polynomials which have a common root mod N
ensures that the squares $\varphi_1(\beta_1)^2$ and $\varphi_2(\beta_2)^2$ are congruent mod N. Now
$\gcd(\varphi_1(\beta_1) + \varphi_2(\beta_2), N)$ will be a non-trivial factor of N with probability at
least 1/2, in the sense described in Section 1.2.1.
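The final gcd step can be illustrated on a toy modulus. In this sketch the numbers are chosen by hand purely for illustration: x and y stand in for the two square roots mod N, and N = 91 is far too small to be of any cryptographic interest.

```python
from math import gcd

N = 91          # toy composite, 7 * 13
x, y = 10, 3    # 10**2 = 100 and 3**2 = 9 are congruent mod 91

# The two squares agree mod N although x and y differ by more than a sign.
assert (x * x - y * y) % N == 0

factor = gcd(x + y, N)      # gcd(13, 91)
cofactor = N // factor
print(factor, cofactor)     # prints "13 7"
```

Had x been congruent to y or to -y mod N, the gcd would have been N or 1, which is the failure case behind the probability bound quoted above.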

The question therefore becomes how to construct the set S. It is here that smooth
polynomial values become relevant. Associated with each polynomial $f_i$ is the binary
homogeneous polynomial

$$F_i(x,y) = y^d f_i(x/y). \qquad (2.1)$$

Where possible, which is not often, we omit the subscript i.
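Evaluating the homogeneous form directly from the coefficients of f avoids any division. The following sketch assumes f is given as a coefficient list with the constant term first; the function name `homogeneous_value` is hypothetical.

```python
def homogeneous_value(coeffs, a, b):
    """Evaluate F(a, b) = b**d * f(a / b) for f(x) = sum(coeffs[i] * x**i).

    Expanding the definition gives F(a, b) = sum(coeffs[i] * a**i * b**(d - i)),
    which uses integer arithmetic only and is also defined for b = 0.
    """
    d = len(coeffs) - 1
    return sum(c * a**i * b**(d - i) for i, c in enumerate(coeffs))


f = [1, 0, 1]                        # f(x) = x^2 + 1
value = homogeneous_value(f, 3, 2)   # 3^2 + 2^2
```

Setting b = 1 recovers f(a), so the same routine serves for both forms.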


Note 2.1.1 Throughout this thesis, we use the upper case F to denote the homogeneous
binary version given in (2.1) of the corresponding lower case $f \in \mathbb{Z}[x]$.

Our set S is constructed by collecting smooth values of the polynomials $F_i$. In
particular, we collect coprime integer pairs (a,b) at which both $F_1(a,b)$ and $F_2(a,b)$
are B-smooth for some smoothness bound B. We call such an (a,b) pair a relation. In
fact it suffices if either of $F_1(a,b)$ or $F_2(a,b)$ is almost smooth, as we discuss in Section
2.1.3. By a simple extension of the linear algebra procedure we saw in Section 1.2.2,
from enough (a,b) relations, a set S of (a,b) can be found for which each

$$\prod_{(a,b) \in S} F_i(a,b) \qquad (2.2)$$

is square in $\mathbb{Z}$. Furthermore, by the construction we explain below, each product (2.2)
being square in $\mathbb{Z}$ makes it practically certain that each

$$\prod_{(a,b) \in S} (a - b\alpha_i) \qquad (2.3)$$

is the required square $\beta_i^2 \in \mathbb{Z}[\alpha_i]$. The implication from (2.2) to (2.3) is certainly not
obvious in advance.

In practice, a sieving process is used to identify relations. This is the so-called 
sieving stage. Since many smooth values are required, and smooth values are rare, this 
is overwhelmingly the most time consuming stage of the algorithm. Indeed, the time 
taken by sieving dominates the run-time analysis of the entire algorithm. That is why 
good polynomial selection is so crucial to decreasing the time taken to factorise N: 
good polynomial selection increases the number of smooth values produced by F. 

On completion of sieving we find S by reduction modulo 2 of a large sparse matrix. 
Although this does not require as much CPU effort as sieving, the matrix reduction is 
a highly non-trivial process. 

On construction of S we have a product of millions of large algebraic numbers
(typically a,b are also in the order of millions). The product is a square $\beta_i^2 \in \mathbb{Z}[\alpha_i]$.
We need $\beta_i$. Hence, a square root of the large algebraic number $\beta_i^2$ must be taken.
This also is a non-trivial exercise.


Thus, the number field sieve has the following steps. 


1. Polynomial Selection: Select $f_1$ and $f_2$ (equivalently, $F_1$ and $F_2$), with a
   common root mod N to produce many smooth values.

2. Sieving: Collect relations. That is, find coprime (a,b) at which both $F_1(a,b)$
   and $F_2(a,b)$ are B-smooth, or almost B-smooth, for some bound B.

3. Matrix Reduction: Reduce a large sparse matrix over $\mathbb{Z}_2$ to find the required
   set S.

4. Square Root: Given $\prod_{(a,b) \in S}(a - b\alpha_i) = \beta_i^2$ for some $\beta_i \in \mathbb{Z}[\alpha_i]$, find $\beta_i$
   (and hence $\varphi_i(\beta_i)$).

On completion of Step 4, computing $\gcd(\varphi_1(\beta_1) + \varphi_2(\beta_2), N)$ will, with probability
at least 1/2, give a non-trivial factor of N. Construction of one set S corresponds to
finding one linear dependency amongst the rows of the large matrix. Finding several
dependencies costs only trivially more time, so in fact we find sufficiently many sets S
to make us practically certain of obtaining a non-trivial factor of N at this stage.

It remains to justify the implication from (2.2) to (2.3). We do this in the next
subsection. In doing so we also relax some of our simplifying assumptions.


2.1.2 Congruent Squares from Smooth Polynomial Values 


We assume some familiarity with algebraic number theory. Useful background refer- 
ences are [75], [8] and [18]. 


We have the polynomial f(x) (at this stage we still assume that f is monic) with $\alpha$
such that $f(\alpha) = 0$, the number field $\mathbb{Q}(\alpha)$ whose ring of algebraic integers is $\mathcal{O}$, and
we consider elements $a - b\alpha \in \mathbb{Q}(\alpha)$. Ultimately we wish to extract enough information
(by sieving) about the multiplicative structure in $\mathbb{Q}(\alpha)$ of each $a - b\alpha$, to deduce that
the product over some set of $a - b\alpha$ is square.


Multiplicative structure in $\mathcal{O}$ is difficult to visualise. Moreover, sieving over
elements in $\mathbb{Q}(\alpha)$ is a complicated proposition. Multiplicative structure in $\mathbb{Z}$ however, is
easy to visualise, and sieving over integers is an entirely attractive proposition. We
make the transition from $\mathbb{Q}(\alpha)$ to $\mathbb{Z}$ using the norm map. Usually the norm of an
element $\zeta \in \mathbb{Q}(\alpha)$ tells something about the multiplicative structure in $\mathbb{Q}(\alpha)$ of $\zeta$. It
turns out that for $\zeta \in \mathbb{Q}(\alpha)$ of the particular form $\zeta = a - b\alpha$, the norm tells everything
about the multiplicative structure in $\mathbb{Q}(\alpha)$ of $\zeta$.


That is, complete information about the ideal factorisation of $(a - b\alpha)$ (the ideal
generated by $a - b\alpha$) can be deduced from the integer factorisation of its norm. Now,
it also turns out that the norm of each element $a - b\alpha$ is given by the value of our
homogeneous polynomial, F(a,b). Hence, the integer factorisation of F(a,b) (which
we discover by sieving) gives information on the multiplicative structure of $(a - b\alpha)$,
and that information suffices in practice to construct the requisite square in $\mathbb{Q}(\alpha)$.


In the remainder of this subsection we elaborate on this argument, essentially giving
an exposition of the results in [14] and elsewhere. After recalling some basic facts, we
explain the argument in a simple case. That is, we assume that in $\mathbb{Q}(\alpha)$ we have
$\mathbb{Z}[\alpha] = \mathcal{O}$. In general of course possibly $\mathbb{Z}[\alpha] \subsetneq \mathcal{O}$. We refer to number fields in which
the assumption holds as convenient number fields. We then relax the assumption, and
consider the argument in what we call arbitrary number fields. That requires the use
of a probabilistic device using quadratic characters. Finally, we relax the assumption
that f is monic.


Convenient Number Fields


Under the assumption that $\mathcal{O} = \mathbb{Z}[\alpha]$ we have available in $\mathbb{Z}[\alpha]$ the full theory of unique
factorisation into prime ideals of ideals in $\mathcal{O}$. Recall the following basic facts.


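• For an element $\zeta \in \mathbb{Q}(\alpha)$, the norm N of $\zeta$ is given by

  $$N(\zeta) = \prod_{i=1}^{d} \sigma_i(\zeta),$$

  where d is the degree of $\mathbb{Q}(\alpha)$ (and of f) and the $\sigma_i$ are the d embeddings of $\mathbb{Q}(\alpha)$
  in $\mathbb{C}$. The norm N is multiplicative.

• For an ideal $\mathfrak{q}$ of $\mathcal{O}$, the norm $\mathfrak{N}$ of $\mathfrak{q}$ is given by

  $$\mathfrak{N}\mathfrak{q} = |\mathcal{O}/\mathfrak{q}|.$$

  The norm $\mathfrak{N}$ is multiplicative. Moreover,

  $$|N(\zeta)| = \mathfrak{N}((\zeta)),$$

  where $(\zeta)$ denotes the ideal generated by $\zeta$.

• For every non-trivial prime ideal $\mathfrak{p}$ of $\mathcal{O}$, we have $\mathfrak{N}\mathfrak{p} = p^{\delta}$ for a unique (rational)
  prime p, and for some positive integer $\delta$. We call $\delta$ the degree of the ideal $\mathfrak{p}$.

• Rational primes p decompose in $\mathcal{O}$ as follows;

  $$(p) = \prod_{i=1}^{g} \mathfrak{p}_i^{e_i}$$

  for positive integers $e_i$. Each $e_i$ is called the ramification index of p at $\mathfrak{p}_i$. With
  $\delta_i$ being the degree of $\mathfrak{p}_i$ we have

  $$\sum_{i=1}^{g} e_i \delta_i = d.$$

  A prime p for which some $e_i > 1$ is called a ramified prime.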
Let $e_p(\zeta) = \mathrm{ord}_p N(\zeta)$. We have

$$\prod_{\mathfrak{p}} \mathfrak{N}\mathfrak{p}^{v_\mathfrak{p}(\zeta)} = \mathfrak{N}((\zeta)) = |N(\zeta)| = \prod_{p} p^{e_p(\zeta)},$$

where $\mathfrak{p}$ ranges across the prime ideals of $\mathcal{O}$, p ranges across the rational primes, and
the $v_\mathfrak{p}(\zeta)$ are non-negative integers (the $\mathfrak{p}$-adic valuations of $\zeta$). We are now interested
in the relationship between $e_p(\zeta)$ and $v_\mathfrak{p}(\zeta)$ when $\zeta$ is square. Since factorisation into prime
ideals in $\mathcal{O}$ is unique, $\zeta$ is square if and only if every $v_\mathfrak{p}(\zeta)$ is even. For this to be the
case it is necessary, but not sufficient, that every $e_p(\zeta)$ is even. If either

• the rational prime p is contained in more than one, say two, distinct prime ideals
  $\mathfrak{p}_i$ and $\mathfrak{p}_j$, or

• $\mathfrak{N}(\mathfrak{p}) = p^{\delta}$ for some $\delta > 1$,

then $e_p(\zeta)$ can be even whilst $v_\mathfrak{p}(\zeta)$ is not.

We avoid the first obstruction by keeping track of appearances of p in $N(\zeta)$ in a
more particular manner than just $e_p(\zeta)$. We will use exponents $e_{p,r}(\zeta)$, with possibly
more than one r for each p, and insist that each of these is even in our purported
square rather than just each $e_p(\zeta)$ being even. The second obstruction is avoided by
the special form of our elements $\zeta = a - b\alpha$.

The following theorem gives the means for overcoming the first obstruction. It can
be found in for example [14] and [48], or viewed as a consequence of Theorem 4.8.13
of [18].


Theorem 2.1.2 Let p be a rational prime and let

$$R(p) = \{r \in \mathbb{Z}_p : f(r) \equiv 0 \bmod p\}.$$

Then the first degree prime ideals of $\mathcal{O}$ with norm p are in one-to-one correspondence
with the pairs (p,r) for $r \in R(p)$.


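Since we are assuming that $\mathbb{Z}[\alpha] = \mathcal{O}$, we can index the first degree primes of $\mathbb{Z}[\alpha]$
by the pairs (p,r). In fact, for $\mathfrak{p} \leftrightarrow (p,r)$ and $\zeta = \sum_{i=0}^{d-1} a_i \alpha^i$ in $\mathbb{Z}[\alpha]$,

$$\mathfrak{p} \mid \zeta \iff \sum_{i=0}^{d-1} a_i r^i \equiv 0 \bmod p. \qquad (2.4)$$

The following theorem, found for example in [14] and [47], overcomes the second
obstruction.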


Theorem 2.1.3 For coprime integers a and b, every prime ideal of $\mathcal{O}$ dividing $(a - b\alpha)$
is a first degree prime ideal.

For the particular elements $a - b\alpha$ we obtain from (2.4) and Theorem 2.1.3 that

$$a - b\alpha \in \mathfrak{p} \iff a - br \equiv 0 \bmod p. \qquad (2.5)$$
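The sets R(p) and the divisibility test (2.5) are easy to experiment with. The sketch below is a toy under stated assumptions: the function names are hypothetical, R(p) is found by brute force (sensible only for small p), and the example polynomial $x^2 + 1$ is chosen because it is monic and its homogeneous form is simply $a^2 + b^2$.

```python
from math import gcd


def roots_mod_p(coeffs, p):
    """R(p): residues r in [0, p) with f(r) = 0 mod p, f(x) = sum(coeffs[i] * x**i)."""
    return [r for r in range(p)
            if sum(c * pow(r, i, p) for i, c in enumerate(coeffs)) % p == 0]


def hits_some_root(coeffs, a, b, p):
    """True if a - b*r = 0 mod p for at least one r in R(p)."""
    return any((a - b * r) % p == 0 for r in roots_mod_p(coeffs, p))


f = [1, 0, 1]   # f(x) = x^2 + 1, so F(a, b) = a^2 + b^2

# For coprime (a, b), p divides F(a, b) exactly when some pair (p, r) is hit.
agreement = all(
    ((a * a + b * b) % p == 0) == hits_some_root(f, a, b, p)
    for p in (2, 3, 5, 13)
    for a in range(-20, 21)
    for b in range(1, 21)
    if gcd(a, b) == 1
)
```

Here R(5) = [2, 3] and R(3) is empty, so no value $a^2 + b^2$ with coprime a and b is divisible by 3.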


So, we now have an exact correspondence between the valuations of ideals $\mathfrak{p} \leftrightarrow (p,r)$
dividing $a - b\alpha$ and the exponent $e_{p,r}$ of p in $N(a - b\alpha)$. That is,

$$|N(a - b\alpha)| = \prod_{\mathfrak{p}} \mathfrak{N}\mathfrak{p}^{v_\mathfrak{p}(a - b\alpha)} = \prod_{(p,r)} p^{e_{p,r}(a - b\alpha)},$$

with $v_\mathfrak{p}(a - b\alpha) = e_{p,r}(a - b\alpha)$. Hence, we have the desired correspondence between
norm factorisations and multiplicative structure in $\mathbb{Q}(\alpha)$.

The following theorem ties the norm values to the values of F (see for example
[83]).

Theorem 2.1.4 In $\mathbb{Q}(\alpha)$ we have $F(a,b) = N(a - b\alpha)$.

We can now describe the sieving framework for F.


Definition 2.1.5 An element $\zeta \in \mathbb{Q}(\alpha)$ is B-smooth exactly when its norm is
B-smooth.
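Theorem 2.1.4 can be checked on small examples without any floating point arithmetic. The sketch below rests on a standard fact that is not taken from this thesis: for monic f, the norm of $a - b\alpha$ equals the determinant of aI - bC, where C is the companion matrix of f (the matrix of multiplication by $\alpha$). All function names are hypothetical.

```python
from fractions import Fraction


def homogeneous_value(coeffs, a, b):
    """F(a, b) = sum(coeffs[i] * a**i * b**(d - i)) for f = sum(coeffs[i] * x**i)."""
    d = len(coeffs) - 1
    return sum(c * a**i * b**(d - i) for i, c in enumerate(coeffs))


def companion(coeffs):
    """Companion matrix of a monic polynomial (constant coefficient first)."""
    d = len(coeffs) - 1
    C = [[0] * d for _ in range(d)]
    for i in range(1, d):
        C[i][i - 1] = 1               # ones on the sub-diagonal
    for i in range(d):
        C[i][d - 1] = -coeffs[i]      # last column holds -c_0, ..., -c_{d-1}
    return C


def determinant(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n, result = len(M), Fraction(1)
    for k in range(n):
        pivot = next((i for i in range(k, n) if M[i][k] != 0), None)
        if pivot is None:
            return 0                  # singular matrix
        if pivot != k:
            M[k], M[pivot] = M[pivot], M[k]
            result = -result          # a row swap flips the sign
        result *= M[k][k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n):
                M[i][j] -= factor * M[k][j]
    return int(result)


def norm_a_minus_b_alpha(coeffs, a, b):
    """N(a - b*alpha) computed as det(a*I - b*C) for monic f."""
    C = companion(coeffs)
    d = len(C)
    return determinant([[(a if i == j else 0) - b * C[i][j] for j in range(d)]
                        for i in range(d)])


f = [-2, 0, 0, 1]                          # f(x) = x^3 - 2
norm_value = norm_a_minus_b_alpha(f, 3, 1)
form_value = homogeneous_value(f, 3, 1)    # 3^3 - 2 * 1^3
```

Both `norm_value` and `form_value` come out as 25 for this example.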


Smooth elements $a - b\alpha \in \mathbb{Z}[\alpha]$ can be found by sieving over the polynomial values
F(a,b) with first degree prime ideals of $\mathbb{Z}[\alpha] = \mathcal{O}$, indexed by (p,r), by checking the
right hand side of (2.5). The factor base consists of all first degree prime ideals of $\mathbb{Z}[\alpha]$
with norm p < B for some bound B. Given sufficiently many B-smooth $a - b\alpha$, by
linear algebra over $\mathbb{Z}_2$ there exists a set S of (a,b) for which

$$\sum_{(a,b) \in S} e_{p,r}(a - b\alpha) \equiv 0 \bmod 2 \qquad (2.6)$$

for all primes $\mathfrak{p} \leftrightarrow (p,r)$ in the factor base. If $\mathcal{O} = \mathbb{Z}[\alpha]$, then (2.6) is sufficient to
guarantee that $\prod_{(a,b) \in S}(a - b\alpha)$ is a square in $\mathbb{Z}[\alpha]$.
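Finding such a set S amounts to finding rows of a 0/1 matrix that sum to zero mod 2. The sketch below uses dense Gaussian elimination on rows stored as Python integers (bit j holds the parity of the exponent for factor base element j). It is purely illustrative: the function name is hypothetical and this is not the sparse method a real factorisation would use.

```python
def find_dependency(rows):
    """Return indices of rows whose XOR is zero, or None if the rows are independent.

    rows: list of non-negative ints, each a bit vector of exponent parities.
    """
    basis = {}   # highest set bit -> (reduced vector, bitmask of rows combined)
    for idx, vec in enumerate(rows):
        combo = 1 << idx
        while vec:
            pivot = vec.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (vec, combo)
                break
            bvec, bcombo = basis[pivot]
            vec ^= bvec          # eliminate the leading bit
            combo ^= bcombo      # remember which original rows were used
        else:
            # vec reduced to zero: the rows recorded in combo sum to zero mod 2
            return [i for i in range(len(rows)) if combo >> i & 1]
    return None


parities = [0b011, 0b110, 0b101]
subset = find_dependency(parities)
```

For the three rows shown, `subset` is [0, 1, 2]: every column has an even number of ones across the selected rows, which is condition (2.6) in miniature.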


Arbitrary Number Fields 


We now drop the pretence that $\mathcal{O} = \mathbb{Z}[\alpha]$. We no longer have at our disposal in $\mathbb{Z}[\alpha]$
the theory of uniqueness of factorisation into prime ideals in $\mathcal{O}$. Instead we look more
generally to prime ideals of arbitrary orders of $\mathbb{Q}(\alpha)$. Denote $\mathbb{Q}(\alpha)$ by K. An order is
a subring of K which as a $\mathbb{Z}$-module is finitely generated and of rank d = deg K (see
[18] Section 4.6). Clearly $\mathbb{Z}[\alpha]$ is an order. The maximal order $\mathcal{O}$ satisfies properties
not satisfied by arbitrary orders A. In [14] the authors present results which tie the
ideal structure in arbitrary orders A to the known structure in $\mathcal{O}$. We are interested
in the case $A = \mathbb{Z}[\alpha]$, with the aim of re-establishing the connection between norm
factorisations of F(a,b) and ideal factorisations in A.

The following result from [14] introduces homomorphisms $l_\mathfrak{p}$. Think of the $l_\mathfrak{p}$ as
generalisations, to arbitrary orders, of $\mathfrak{p}$-adic valuations in $\mathcal{O}$.


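Theorem 2.1.6 Let A be an order of $\mathcal{O}$. There is, for each prime $\mathfrak{p}$ of A, a group
homomorphism $l_\mathfrak{p} : K^* \to \mathbb{Z}$, such that the following hold:

1. $l_\mathfrak{p}(\beta) \ge 0$ for all $\beta \in A$, $\beta \ne 0$;

2. if $\zeta \in A$ and $\zeta \ne 0$, then $l_\mathfrak{p}(\zeta) > 0$ if and only if $\zeta \in \mathfrak{p}$;

3. for all $\zeta \in K^*$ we have $l_\mathfrak{p}(\zeta) = 0$ for all but finitely many $\mathfrak{p}$, and

   $$\prod_{\mathfrak{p}} (\mathfrak{N}\mathfrak{p})^{l_\mathfrak{p}(\zeta)} = |N(\zeta)|,$$

   where $\mathfrak{p}$ ranges over the primes of A.

The functions $l_\mathfrak{p}$ are deduced from the structure in $\mathcal{O}$ as follows. Let $\mathfrak{q}$ be a prime in
$\mathcal{O}$ lying above the prime $\mathfrak{p}$ in A (that is, $\mathfrak{p} = A \cap \mathfrak{q}$). The field $\mathcal{O}/\mathfrak{q}$ is an extension
of $A/\mathfrak{p}$ of degree $\delta$ say. Then, informally, $l_\mathfrak{p}$ counts $\delta$ appearances of $\mathfrak{p}$ in A for every
appearance of $\mathfrak{q}$ in $\mathcal{O}$. Applying Theorem 2.1.6 to $\zeta = a - b\alpha$ gives the following.


Corollary 2.1.7 Let a and b be coprime integers and let $\mathfrak{p}$ be a prime of $\mathbb{Z}[\alpha]$. If $\mathfrak{p}$ is
not a first degree prime then $l_\mathfrak{p}(a - b\alpha) = 0$. If $\mathfrak{p}$ is a first degree prime corresponding
to the pair (p,r) then $l_\mathfrak{p}(a - b\alpha) = e_{p,r}(a - b\alpha)$.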


Hence, we have again captured an exact correspondence between the integer
factorisation of $N(a - b\alpha)$ and the ideal factorisation of $(a - b\alpha)$. Now, by linear algebra
over $\mathbb{Z}_2$ we are able to find a set S of coprime pairs (a,b) for which

$$\sum_{(a,b) \in S} e_{p,r}(a - b\alpha) \equiv 0 \bmod 2. \qquad (2.7)$$

We saw previously, by uniqueness of factorisation of prime ideals in $\mathcal{O}$, that if $\mathcal{O} = \mathbb{Z}[\alpha]$
then (2.7) is sufficient to guarantee that $\prod_{(a,b) \in S}(a - b\alpha)$ is a square in $\mathbb{Z}[\alpha]$. This is
not the case in general for the following reasons (from [14]).


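1. $\prod_{(a,b) \in S}(a - b\alpha)\mathcal{O}$ may not even be a square in $\mathcal{O}$, since we have considered only
   primes of $\mathbb{Z}[\alpha]$.

2. Even if $\prod_{(a,b) \in S}(a - b\alpha)\mathcal{O}$ is a square in $\mathcal{O}$, it may not be the square of a principal
   ideal in $\mathcal{O}$.

3. Even if $\prod_{(a,b) \in S}(a - b\alpha)\mathcal{O}$ is the square of a principal ideal in $\mathcal{O}$,
   $\prod_{(a,b) \in S}(a - b\alpha)$ is not necessarily a square generator.

4. Even if $\prod_{(a,b) \in S}(a - b\alpha)$ is a generator $\beta^2$ (for some $\beta \in \mathcal{O}$) of a principal square
   ideal, $\beta$ does not necessarily lie in $\mathbb{Z}[\alpha]$.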


Obstruction 4 can easily be overcome, since if $\prod_{(a,b) \in S}(a - b\alpha)$ is a square in $\mathcal{O}$
then

$$f'(\alpha)^2 \prod_{(a,b) \in S} (a - b\alpha) = \omega^2$$

for some $\omega \in \mathbb{Z}[\alpha]$. The remaining obstructions are overcome using quadratic
characters.


Quadratic Characters


The use of quadratic characters was first suggested by Adleman [1]. Let V be the
(multiplicative) group of $\zeta \in K^*$ for which $l_\mathfrak{p}(\zeta) \equiv 0 \bmod 2$ for all primes $\mathfrak{p}$ of $\mathbb{Z}[\alpha]$.
That is, V contains the elements which, judging by the primes of $\mathbb{Z}[\alpha]$, look like squares
in K. Not all of them are, so $K^{*2} \subset V$ where $K^{*2}$ is the multiplicative group of squares
in K. Now, $V/K^{*2}$ forms a vector space over $\mathbb{Z}_2$. In [14] (Theorem 6.7) it is shown
that there exists a “small” upper bound on the dimension of the vector space. Since
V is in that sense not too much larger than $K^{*2}$, it is plausible that there exists a
probabilistic method by which, given $\zeta \in V$, it is practically certain that in fact the
element $\zeta \in K^{*2}$. Quadratic characters give us such a method.

Let q be an odd prime and let $s \in R(q)$ be such that the ideal (q, s) does not lie in
the factor base. Also, let

$$\chi_q(a - b\alpha) = \left(\frac{a - bs}{q}\right).$$

The essence of using quadratic characters $\chi_q$ is the following. We can be practically
certain that if $\zeta \in \mathbb{Z}[\alpha]$ satisfies $\chi_q(\zeta) = 1$ for sufficiently many first degree primes q
not lying above $\zeta$, then $\zeta$ is a square in K.

In practice, extra columns are annexed to the matrix over $\mathbb{Z}_2$ whose rows represent
the relations. Each extra column corresponds to a test prime q. The entry in the row
corresponding to (a,b) and column corresponding to q is 1 if $\chi_q(a - b\alpha) = -1$
and zero otherwise. Hence a linear dependency amongst the rows ensures

$$\chi_q\Bigl(\prod_{(a,b) \in S} (a - b\alpha)\Bigr) = \prod_{(a,b) \in S} \left(\frac{a - bs}{q}\right) = 1$$

for all test primes q. If sufficiently many q are chosen, $\prod_{(a,b) \in S}(a - b\alpha)$ is almost
certainly a square in $\mathbb{Z}[\alpha]$. Obstructions 1-3 are overcome in one hit, and the relationship
in (2.7) captures square appearances of ideals.
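The extra matrix columns are cheap to compute. The sketch below evaluates the Legendre symbol by Euler's criterion and derives the 0/1 entry for a relation (a,b) and a test pair (q,s). The function names are hypothetical, and the toy data assume $f(x) = x^2 + 3$, for which s = 2 is a root modulo q = 7.

```python
def legendre(n, q):
    """Legendre symbol (n/q) for an odd prime q, via Euler's criterion."""
    t = pow(n % q, (q - 1) // 2, q)
    return -1 if t == q - 1 else t     # t is 0, 1 or q - 1


def character_bit(a, b, q, s):
    """Matrix entry for relation (a, b) and test pair (q, s): 1 when the symbol is -1."""
    return 1 if legendre(a - b * s, q) == -1 else 0


q, s = 7, 2                            # 2**2 + 3 = 7, so f(s) = 0 mod q
relations = [(5, 1), (7, 1)]           # toy (a, b) pairs, chosen by hand
bits = [character_bit(a, b, q, s) for a, b in relations]

# An even number of ones in the column means the product of the symbols is +1.
product = 1
for a, b in relations:
    product *= legendre(a - b * s, q)
```

In this toy column both entries are 1, their sum is even, and correspondingly the product of the two Legendre symbols is +1.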


Non-monic Polynomials 


So far we have assumed f and F are monic. Allowing f to be non-monic gives smaller
coefficients - some of the “size” of the coefficients can be pushed onto the leading
coefficient. For example, the monic base-m method gives $m = O(N^{1/d})$ and $a_i = O(N^{1/d})$.
Non-monic base-m polynomials have $m = O(N^{1/(d+1)})$ and $a_i = O(N^{1/(d+1)})$. It is
crucial that we are again able to capture the correspondence between the integer
factorisation of $N(a - b\alpha)$ and the ideal factorisation of $a - b\alpha$. Allowing non-monic
polynomials requires some minor adjustments, and we outline these below.

The significance to this point of f being monic is that $\alpha$ being a root of f guarantees
that $\alpha \in \mathcal{O}$, so $\mathbb{Z}[\alpha]$ is an order of $\mathcal{O}$. If $a_d \ne \pm 1$ that is not necessarily the case. It
turns out however that $A = \mathbb{Z}[\alpha] \cap \mathbb{Z}[\alpha^{-1}]$ is an order of $\mathcal{O}$ (see [14]). So Theorem
2.1.6 is again available.


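Let $\omega$ be a zero of $F(x, a_d)$. If $\alpha = \omega/a_d$ then

$$F(\omega, a_d) = 0 \implies F(\alpha, 1) = f(\alpha) = 0$$

since F is homogeneous. Now, $\mathbb{Z}[\omega]$ is an order ($\omega \in \mathcal{O}$), and $a_d(a - b\alpha) = a_d a - b\omega$.
Also,

$$a_d^{d-1} F(a,b) = N(a_d a - b\omega),$$

and $N(a_d a - b\omega) = a_d^{d} N(a - b\alpha)$ since the norm is multiplicative, so we have

$$F(a,b) = a_d N(a - b\alpha),$$

compared with Theorem 2.1.4.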

Recall that in the monic case we are able to index the first degree primes of $\mathbb{Z}[\alpha]$
by pairs (p,r) for $r \in R(p)$. If we now identify r with $r_1/r_2$ whenever $r_2 \ne 0$ then it
makes sense to consider the set $\{(r_1, r_2) \in \mathbb{Z}_p^2 : F(r_1, r_2) \equiv 0 \bmod p\}$. In fact we define

$$R'(p) = \{(r_1, r_2) \in \mathbb{Z}_p^2 : F(r_1, r_2) \equiv 0 \bmod p\} \cup \{\infty\},$$

and in the case $r_2 = 0$ we identify $r \in R(p)$ with $\infty \in R'(p)$. Now, let $e_{p,r}(a,b)$
as before denote the exponent in $N(a - b\alpha)$ corresponding to the ideal (p,r). Then
Theorem 2.1.6 admits homomorphisms $l_\mathfrak{p}$ (where $\mathfrak{p}$ ranges across the prime ideals of
A) for which

$$e_{p,r}(a,b) = \begin{cases} l_\mathfrak{p}(a - b\alpha) & \text{if } r \ne \infty, \\ l_\mathfrak{p}(a - b\alpha) + \mathrm{ord}_p\, a_d & \text{if } r = \infty, \end{cases}$$

(see [14] and [59]).

So, with non-monic polynomials, we again capture the correspondence between
norm and ideal factorisations that gives rise to a practical method of sieving. Thus,
the significance of smooth polynomial values is that they carry enough information
on the multiplicative structure in $\mathbb{Q}(\alpha)$ of the corresponding elements, to enable the
construction of squares in $\mathbb{Z}[\alpha]$.


Aside 2.1.8 For another application which uses this correspondence, see the literature
regarding practical solution of Thue-Mahler equations [76].


Since we now understand how sieving corresponds to deducing multiplicative
structure in $\mathbb{Q}(\alpha)$, we should consider how sieving occurs.


2.1.3 The Sieving Step 


In this subsection we survey sieving techniques and variations thereof which are relevant
to the factorisations discussed in Chapter 6. The main sieving techniques are lattice
sieving and line sieving, and the main variations are large prime variations.


Sieving Methods


The use of sieving techniques in factorisation was first proposed for the quadratic sieve.
Indeed, this innovation allowed the quadratic sieve to out-perform its competitors at
the time. The idea is that given p and a polynomial W(x), the integer values of x for
which $p \mid W(x)$ are regularly spaced mod p. Starting with an array of W(x) values for x in
some range, and a particular $x_0$ at which $p \mid W(x_0)$, the remaining x at which $p \mid W(x)$
satisfy

$$x \equiv x_0 \bmod p.$$

Moreover, division of W(x) by p can be mimicked by subtraction from log W(x) of
log p. So, starting instead with an array of log W(x) values, subtract log p from each
array entry corresponding to $x \equiv x_0 \bmod p$. After sieving with all p in the factor base,
array entries which are below some threshold are called candidate relations, and are
checked for smoothness by actually factorising them.
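The logarithmic sieve just described can be sketched as follows. This is a toy under stated assumptions: the function name and parameters are hypothetical, roots of W mod p are found by brute force, W is assumed non-zero on the interval, and log p is subtracted once per prime (prime powers are ignored, which is why a non-zero threshold is needed).

```python
import math


def log_sieve(W, lo, hi, primes, threshold):
    """Return the x in [lo, hi) whose residual log |W(x)| is at most threshold."""
    logs = [math.log(abs(W(x))) for x in range(lo, hi)]
    for p in primes:
        log_p = math.log(p)
        for x0 in (r for r in range(p) if W(r) % p == 0):
            first = lo + (x0 - lo) % p          # smallest x >= lo with x = x0 mod p
            for x in range(first, hi, p):
                logs[x - lo] -= log_p           # mimics one division by p
    return [lo + i for i, v in enumerate(logs) if v <= threshold]


candidates = log_sieve(lambda x: x * x + 1, 1, 20, [2, 5, 13], 1.7)
```

With $W(x) = x^2 + 1$ and the factor base {2, 5, 13}, the candidates are x = 1, 2, 3, 5, 7, 8 and 18. The entries for x = 7 and x = 18 survive only because the threshold tolerates the unsieved second factor of 5 in W(7) = 50 and W(18) = 325.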

In the number field sieve, the (a,b) pairs for which F(a,b) contains an appearance of
the factor base element (p,r) are regularly spaced mod p. Sieving is therefore available
as a means of relation collection. As we saw in the previous subsection, if $\mathfrak{p} \leftrightarrow (p,r)$
for some $r \in R'(p)$ is an ideal in the factor base, then for coprime (a,b) we have

$$p \mid F(a,b) \iff a - br \equiv 0 \bmod p.$$

This gives rise to an obvious method of sieving. Start with an array of values log F(a,b).
Fix b and find the first $a = a_0$ (in the relevant range) for which $a_0 \equiv br \bmod p$.
Then subtract log p from the array entries corresponding to $a_0$ and to the remaining
$a \equiv a_0 \bmod p$. Then increment b. This is called classical sieving and was the first
method suggested for the number field sieve ([14], [63] and [48]).

John Pollard then suggested the improvement which he called lattice sieving [64]. 
An extension was implemented in [35]. Lattice sieving is substantially faster than 
classical sieving, and has all but replaced it in practice. 

The idea of lattice sieving is as follows. Fix a set Q of primes q for which F has 
at least one root modulo each q. Each q € Q is called a special q. Sieving occurs 
only over those (a,b) for which it is known that q|F (a,b) for some q € Q. If q is not 
too small, then knowing that q|F (a,b) renders it more likely that F (a,b) is smooth. 
Clearly, smooth values of F (a, b) without a divisor in Q will be missed, but Q is chosen 
to ensure that the cost in missed relations is much less than the gain in efficiency. 

We receive a gain in efficiency because it is quick to generate, given a factor base
element (q, s) with $q \in Q$, the (a,b) pairs for which $q \mid F(a,b)$ and $a/b \equiv s \bmod q$. Such
pairs form a lattice $L_{q,s}$ in the (a,b) plane, so can be generated quickly using a reduced
lattice basis.
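A reduced basis for such a lattice can be obtained with the classical two-dimensional reduction of Lagrange (often attributed to Gauss). The sketch below is an assumption-laden illustration rather than the procedure of [64] or [35]: it starts from the obvious basis (q, 0), (s, 1) of the pairs with $a \equiv bs \bmod q$, and all function names are hypothetical.

```python
def round_div(n, d):
    """Nearest integer to n / d for d > 0, using integer arithmetic only."""
    return (2 * n + d) // (2 * d)


def reduce_basis(u, v):
    """Lagrange reduction of a basis (u, v) of a two-dimensional integer lattice."""
    def norm2(w):
        return w[0] * w[0] + w[1] * w[1]

    if norm2(u) < norm2(v):
        u, v = v, u
    while True:
        # Subtract from the longer vector the nearest-integer multiple of the shorter.
        mu = round_div(u[0] * v[0] + u[1] * v[1], norm2(v))
        u = (u[0] - mu * v[0], u[1] - mu * v[1])
        if norm2(u) >= norm2(v):
            return v, u              # shorter vector first
        u, v = v, u


def special_q_basis(q, s):
    """Reduced basis of the lattice {(a, b) : a = b*s mod q}."""
    return reduce_basis((q, 0), (s, 1))


# Toy example: F(a, b) = a^2 + b^2 has the root s = 5 modulo q = 13.
b1, b2 = special_q_basis(13, 5)
points = [(c * b1[0] + e * b2[0], c * b1[1] + e * b2[1])
          for c in range(-3, 4) for e in range(-3, 4)]
```

Every pair in `points` satisfies $a \equiv 5b \bmod 13$, so 13 divides $a^2 + b^2$ at each of them. The short basis vectors keep the enumerated (a,b) small, which is the practical reason for reducing the basis.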

Within $L_{q,s}$ we continue sieving with each prime p < q which occurs in the factor
base. Sieving with p occurs in one of two ways, by rows or by vectors. Denote by (c, e)
the coordinate system in $L_{q,s}$ with respect to its reduced basis. Sieving by rows fixes e
and, analogously to classical sieving, sieves the factor base elements (p, r) with p < q
along e. Sieving by vectors regards pairs in the (c, e) plane corresponding to (a,b) pairs
for which $p \mid F(a,b)$ with $a/b \equiv r \bmod p$, as a sub-lattice (abbreviated to $L_p$) of $L_{q,s}$. A
basis for $L_p$ is not always well-defined. If not, then this (p, r) is sieved by rows. If so,
then $L_p$ is generated from a reduced basis. More complete details of these processes
are to be found in [35].

Line sieving (see [33] for details) is similar to lattice sieving by rows. It corresponds 
to lattice sieving with fixed b. That is, for each special q, fix b then perform lattice 
sieving on all (a,b) for which q|F (a,b), then increment b. Incrementing b is a com- 
paratively expensive operation. Typically, polynomials generated for use with the line 
siever are re-written so that b is not changed often. Indeed, often b is fixed at b = 1. 
This is made more efficient by the use of “skewed” bases for the relevant sub-lattices 
in the (a,b) plane. 

For the most part in this thesis we use the number field sieve implementation 
described in [33]. In Chapter 6 we refer to the siever in this implementation as the 
CWI siever. We also report in Chapter 6 on some sieving performed using the lattice 
siever of [35], adjusted to use the “skewed” basis representations mentioned above. We
refer in Chapter 6 to this adjusted lattice siever as the AKL siever.


Remark 2.1.9 To this point we have spoken only of algorithmic, or “software” con- 
siderations in sieving. Adi Shamir recently proposed a hardware variation [70]. He 
proposes an opto-electrical device specifically designed to perform sieving operations. 
In [70] the device is presented only for MPQS sieving, but adjustment for the number 
field sieve should not be difficult. The device, called TWINKLE, is only in the
conceptual stages at present. If built, it would be a cheap platform on which to perform
sieving up to 500-1000 times faster than conventional workstations today.


Notice that improvements like TWINKLE to sieving procedures are still subject 
to new advances in polynomial selection. Polynomials which have good yield do so 
independently of the sieving device used. The sieving device determines the amount
of wall-clock time required to detect that yield.


Large Prime Variations 


Sieving time can be reduced considerably by accepting F-values which are almost smooth.
Such a variation to sieving is known as a large prime variation.

Large prime variations to MPQS are well known, see for example [49] and [7]. We
have two smoothness bounds, $B_1$ and $B_2$ with $B_1 < B_2$. During sieving, polynomial
values are accepted if they contain at most two so called large primes between $B_1$ and
$B_2$, but are otherwise $B_1$-smooth. Relations containing no large primes are called full
relations. Relations containing exactly one or two large primes are called 1LP- and
2LP-relations respectively. 1LP- and 2LP-relations are referred to collectively as large
prime relations.

In the same way that the primes at most $B_1$ need to be “paired-up” to form squares,
so do the large primes. In MPQS with two large primes ($P^2$-MPQS) this is usually
thought of as a graph theory problem. Let G = (V,E) be the graph whose vertices are
the large primes appearing in the relations and the notional vertex 1, and for which
$\{P_1, P_2\} \in E$ exactly when the large primes $P_1$ and $P_2$ appear in the same large prime
relation. A 1LP-relation is represented by the edge $\{P_1, 1\}$. Finding a fundamental
cycle in G corresponds to finding a subset of the large prime relations in which every $P_i$
occurring across the subset does so exactly twice. Hence, the large primes are “squared
up” by finding fundamental cycles in G.
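Counting the fundamental cycles of G does not require finding them explicitly: each edge whose endpoints are already connected closes exactly one new independent cycle. The sketch below counts cycles with a union-find structure. It is an illustration under assumptions: the function name is hypothetical, each relation is given as a pair of large primes, and 1 is used as the notional vertex for 1LP-relations.

```python
def count_cycles(relations):
    """Number of independent cycles in the large prime graph.

    relations: iterable of (P1, P2) pairs, with P2 = 1 for a 1LP-relation.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x

    cycles = 0
    for p1, p2 in relations:
        r1, r2 = find(p1), find(p2)
        if r1 == r2:
            cycles += 1          # endpoints already connected: a new cycle
        else:
            parent[r1] = r2      # otherwise merge the two components
    return cycles


toy_relations = [(101, 1), (101, 103), (103, 1), (107, 109), (107, 109)]
n_cycles = count_cycles(toy_relations)
```

The toy data contain two cycles: the triangle through 1, 101 and 103, and the repeated pair (107, 109). The count equals the number of edges minus the number of vertices plus the number of connected components.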

In MPQS with only one large prime (P-MPQS), the number of cycles obtained
(cycles in P-MPQS are better thought of as “matches”) as a function of the number of
large prime relations gathered has been analysed ([49], [7], [55]). For $P^2$-MPQS, this
analysis is an open problem. One possibility for insight is the theory of random graphs
(for example [60]). Most random graph models assume that the probability of a given
edge occurring is uniform across E, however see [42] for a treatment with non-uniform
edge probabilities.

Large prime variations to the number field sieve are complicated by the presence
of two polynomials, each with their own factor base. With smoothness bounds $B_1$ and
$B_2$, we consider an (i,j)LP-relation to be one for which $F_1$ is iLP-smooth and $F_2$ is
jLP-smooth. We refer to such a variation with $i \le I$ and $j \le J$ as the number field
sieve with (I + J) large primes. In [30] an implementation is presented with I = J = 2.
In subsequent chapters we deal mainly with each polynomial individually, so we refer
just to a particular value being iLP-smooth.

The problem of “squaring up” the large primes is complicated significantly in the 
number field sieve by the presence of two factor bases. Thinking about the problem in 
terms of cycle finding in graphs is less instructive than with MPQS. Nevertheless we 
still refer to a set of large prime relations in which every large prime occurring in each 
factor base does so exactly twice in that factor base, as a cycle. 

In [30] the number of cycles obtained as a function of the number of relations
gathered is considered. The authors observe smooth growth in the number of cycles
initially, followed by a sudden, almost vertical increase. This phenomenon has become
known as cycle explosion. Cycle explosion has been observed in other large
factorisations (for example, [23]). Indeed, practitioners now expect cycle explosion for
large factorisations, and are able to tell that it is imminent (we do not elaborate on
this). However, cycle explosion has not been analysed. Connections to the behaviour of
random graphs have again been noted, but the existence of two factor bases complicates
matters even more than the unresolved $P^2$-MPQS case. It may be useful to think of
the number field sieve case in terms of a hypergraph whose edges span the distinct
factor bases. If so, then it is worth noting that [41] extends Kovalenko's result [42] on
non-uniform edge probabilities to hypergraphs.

Finally we note that in the RSA-140 and RSA-155 factorisations discussed in Chapter 6,
the CWI siever has $I = 3$ for the non-linear polynomial $F_1$, and $J = 2$ for the
linear $F_2$. The AKL siever has $I = J = 2$, as in [30] and [23].


2.1.4 The Matrix Step 


Before discussing the matrix step itself, we mention an important pre-computation
called filtering. Filtering is a process which reduces the amount of data entering the
matrix by removing less useful relations. For example, a relation involving an ideal
that does not occur in any other relation is useless. We do not elaborate on
filtering here; see [33]. We do, however, emphasize that (especially given the following
remarks on troublesome matrix sizes) good filtering strategies are becoming
increasingly important. For a description of the filtering strategy applied during the
RSA-140 factorisation, see [16] and [15]. For earlier considerations similar to filtering
(aimed at reducing the number of relations per cycle amongst the large prime relations)
see [26].

We now consider the matrix reduction itself. We are required to reduce a large
sparse matrix over $\mathbb{Z}_2$. Peter Montgomery's implementation of the Blocked
Lanczos Algorithm over $\mathbb{Z}_2$ described in [33] and [52] addresses the problem.
For earlier considerations see [44]. We do not give details of these algorithms.

In practice however, the matrix reduction is becoming a bottleneck. In its present 
form, the Lanczos implementation runs only on a single machine. The matrices re- 
quiring reduction are very large. The problem is exacerbated using the number field 
sieve, as opposed to MPQS, because the number field sieve requires a factor base for 
each polynomial. In fact, the matrix reduction becomes a major practical issue for fur- 
ther record factorisations. Our improvements to polynomial selection however, have 
an impact on this problem, as do good filtering strategies. 

In the next section we meet the following heuristic guide to the expected size of
the matrix for factoring some $N_2$. If $S(N_2, N_1)$ is the ratio of sieving effort
required to factorise $N_2$ compared to that of $N_1$, and $M(N_2, N_1)$ is the similar
ratio for the relative matrix size, then

$$M(N_2, N_1) = \sqrt{S(N_2, N_1)}. \tag{2.8}$$


Extrapolation using the asymptotic run-time estimate $L_N[1/3, (64/9)^{1/3}]$ for sieving
gives

$$S(\text{RSA-140}, \text{RSA-130}) \approx 4,$$

so we expect

$$M(\text{RSA-140}, \text{RSA-130}) \approx 2.$$

The RSA-130 matrix had approximately 3.5 million rows and columns. A matrix of
the expected size 7 million would be troublesome. However, the RSA-140 matrix had
only 4.7 million rows and columns, and $4.7/3.5 \approx 1.3$.
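The extrapolation above can be reproduced approximately with the following sketch. It assumes $N \approx 10^{\text{digits}}$ and simply drops the $o(1)$ term in the L-function, so the figures it produces are only indicative. The function names are hypothetical.

```python
import math

def nfs_log_effort(digits):
    """log of L_N[1/3, (64/9)^(1/3)] with o(1) dropped, for N ~ 10^digits."""
    log_n = digits * math.log(10)
    c = (64 / 9) ** (1 / 3)
    return c * log_n ** (1 / 3) * math.log(log_n) ** (2 / 3)

def sieving_ratio(digits_2, digits_1):
    """S(N_2, N_1): estimated ratio of sieving effort."""
    return math.exp(nfs_log_effort(digits_2) - nfs_log_effort(digits_1))

def matrix_ratio(digits_2, digits_1):
    """M(N_2, N_1), using the heuristic (2.8)."""
    return math.sqrt(sieving_ratio(digits_2, digits_1))

if __name__ == "__main__":
    print(sieving_ratio(140, 130), matrix_ratio(140, 130))
```

Under these simplifications the ratio for 140 against 130 digits comes out close to 4, and the matrix ratio close to 2, in line with the values quoted.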

How does this discrepancy between expected and actual size come about? Because
of our improvements to polynomial selection used for the RSA-140 factorisation, we
found that

$$S(\text{RSA-140}, \text{RSA-130}) \approx 2.$$

From this, we expect

$$M(\text{RSA-140}, \text{RSA-130}) \approx \sqrt{2},$$

which is close to the value obtained. So, better polynomial selection not only decreases
sieving time, but helps restrict the size of the matrix.

Note however, that it is only possible to exploit the improved yield to restrict the 
matrix size if good filtering strategies are in place. For us this is certainly the case. In 
fact, sieving now continues longer than is strictly necessary, so that there is a bigger 
pool of relations. From this pool it is hoped that a combination of relations can be 
chosen by filtering to lead to a smaller (or less dense) matrix. 


Further extrapolation using the L-function gives

$$S(\text{RSA-155}, \text{RSA-140}) \approx 7.0.$$

This gives

$$M(\text{RSA-155}, \text{RSA-140}) \approx 2.6,$$

and a matrix for RSA-155 well in excess of 10 million rows and columns. This would
be problematic. However, according to the analysis in Chapter 6, we made better
use of our new polynomial selection methods for RSA-155 than for RSA-140, and so
obtained a better polynomial (relatively speaking) for RSA-155. We can expect this
to affect both the sieving effort and the size of the matrix favourably.


Further possibilities regarding the matrix step are discussed in Chapter 7. 


2.1.5 The Square Root Algorithm 


The square root stage of the number field sieve was initially expected to be a technical 
difficulty for large N. However, thanks again to Peter Montgomery, the problem is 
essentially solved. 

An early method for extracting the relevant square root is outlined in [14]. The
method relies on Hensel lifting. However, the integers involved in the last few liftings
are very large, so that even with fast multiplication techniques, the time taken to
multiply them is prohibitive.

In [22] the author suggests a method based on the Chinese Remainder Theorem
that avoids these large multiplications. However, that method requires $d = \deg F$ to
be odd.

Montgomery's method relies on ideal arithmetic (in fact on arithmetic of fractional
ideals). That is, it makes use of the fact that the ideal factorisation of each
$a - b\alpha$ is known, to reduce the problem to a manageable size. Details can be found
in [51] and [33], with some minor variations in [59].


2.2 Smooth Integers 


We now focus on the study of smooth integers, particularly from an analytic
perspective. Recall from Chapter 1 that we refer to an integer chosen uniformly at random
from those at most $r$ as a random value $i_r$. In Chapters 3-6 we make extensive use
and abuse of well-known results concerning the asymptotic probability that $i_r$ is
$r^{1/u}$-smooth (for fixed $u > 1$ as $r \to \infty$). In Section 2.2.1 we survey the
relevant results on this probability.

We then consider in Section 2.2.2 smooth integers in the context of the number
field sieve. The focus is again on asymptotic considerations. We give an exposition of
the asymptotic complexity analysis of the number field sieve. The aim is to understand
the connection between the asymptotic run-time of the algorithm and the polynomial
values which are required to be smooth.


2.2.1 Smooth Integers Generally 


Let $P_j(n)$ denote the $j$-th largest prime factor of $n$. Also for $r, B \in \mathbb{Z}$ let

$$\psi(r, B) = \#\{n \in \mathbb{Z}^+ : n \le r \text{ and } P_1(n) \le B\}.$$

For $u \in \mathbb{R}$ with $u > 0$ we define

$$\rho(u) = \lim_{r \to \infty} \frac{\psi(r, r^{1/u})}{r} \quad \text{for } u > 1 \tag{2.9}$$

and $\rho(u) = 1$ otherwise. This is called the Dickman function, since Dickman studied
in [27] the limit in (2.9). Think of $\rho(u)$ as the asymptotic probability that a random
value $i_r$ has its largest prime factor at most $r^{1/u}$. That is, $\rho(u)$ is the asymptotic
probability that the random value $i_r$ is $B$-smooth with $u = (\log r)/\log B$.
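For small $r$ the counting function $\psi(r, B)$ can be computed directly, which gives a way to compare observed proportions of smooth integers with $\rho(u)$. The sketch below is an illustration only. The name `count_smooth` is hypothetical, and the sieve used is a simple largest-prime-factor sieve chosen for brevity.

```python
def count_smooth(r, B):
    """Return psi(r, B): the number of 1 <= n <= r with no prime factor > B."""
    # largest[n] ends up holding the largest prime factor of n (1 for n = 1)
    largest = [1] * (r + 1)
    for p in range(2, r + 1):
        if largest[p] == 1:                     # p has no smaller prime factor
            for multiple in range(p, r + 1, p):
                largest[multiple] = p           # larger primes overwrite later
    return sum(1 for n in range(1, r + 1) if largest[n] <= B)

if __name__ == "__main__":
    r, B = 10 ** 6, 100                         # u = (log r)/log B = 3
    print(count_smooth(r, B) / r)
```

With $r = 10^6$ and $B = 100$ we have $u = 3$. The printed proportion can be set against $\rho(3) \approx 0.0486$, bearing in mind that $\rho$ describes the limit $r \to \infty$ and that agreement at small $r$ is only rough.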

The Dickman function is well studied; see [58] for a survey of the results. For our
purposes it suffices to note the following. It is well known that the Dickman function
satisfies

$$\psi(r, r^{1/u}) = r\rho(u) + (1 - \gamma)\frac{r\rho(u - 1)}{\log r} + O\!\left(\frac{r}{(\log r)^2}\right) \tag{2.10}$$

where $\gamma$ is Euler's constant [40]. In our range of interest, the second term in (2.10)
contributes to the second significant figure of $\rho(u)$.

We denote by $P(r, B)$ the probability that a random value $i_r$ is $B$-smooth. Using
(2.10) we obtain

$$P(r, B) \approx \rho(u) + (1 - \gamma)\frac{\rho(u - 1)}{\log r} \tag{2.11}$$

as an approximation to $P(r, B)$ which is adequate for our purposes. We use this
approximation throughout Chapters 4-6.


Calculating $\rho(u)$


To do so, we need to be able to calculate $\rho(u)$. The Dickman function satisfies the
differential-difference equation

$$u\rho'(u) + \rho(u - 1) = 0 \tag{2.12}$$

for $u > 1$ [27]. It follows immediately that $\rho(2) = 1 - \log 2$. It also follows that
$\rho(u)$ can be computed by numerical integration from

$$\rho(u) = \frac{1}{u}\int_{u-1}^{u} \rho(t)\,dt,$$

see [77]. This method is also used to compute $\rho(u)$ in [39] and [40].
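As an illustration of the numerical approach, the following sketch integrates the differential-difference equation (2.12) directly on a uniform grid with the trapezoidal rule, starting from $\rho(u) = 1$ on $[0, 1]$. This is a simple scheme chosen for clarity. It is not claimed to be the scheme of [77], and the function names and default step count are assumptions.

```python
def dickman_table(u_max, steps_per_unit=1000):
    """Tabulate rho on [0, u_max] at spacing 1/steps_per_unit.

    Uses rho(u) = 1 on [0, 1] and integrates rho'(u) = -rho(u - 1)/u,
    which is (2.12), with the trapezoidal rule.
    """
    n = steps_per_unit
    total = round(u_max * n)
    rho = [1.0] * (total + 1)
    for k in range(n + 1, total + 1):
        # rho(t - 1)/t at t = (k - 1)/n and t = k/n; the factors of n
        # cancel against the half step 1/(2n), leaving 0.5 * (left + right)
        left = rho[k - 1 - n] / (k - 1)
        right = rho[k - n] / k
        rho[k] = rho[k - 1] - 0.5 * (left + right)
    return rho

def dickman(u, steps_per_unit=1000):
    """Approximate rho(u) at the grid point nearest to u."""
    if u <= 1:
        return 1.0
    return dickman_table(u, steps_per_unit)[-1]
```

The value at $u = 2$ can be checked against the closed form $1 - \log 2$ stated above.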

A more effective method is described in [3]. There it is noted that for integers
$l > 0$ there exist analytic functions $\rho^{(l)}(u)$ that agree with $\rho(u)$ on the
interval $[l - 1, l]$. Hence, we may obtain Taylor series expansions on those intervals.
Moreover, given the Taylor expansion for $\rho^{(l)}(u)$, the Taylor expansion for
$\rho^{(l+1)}(u)$ can be obtained.

So, to calculate $\rho(u)$ on $[l - 1, l]$ we use

$$\rho^{(l)}(l - \xi) = \sum_{i=0}^{\infty} c_i^{(l)} \xi^i$$

for $0 \le \xi \le 1$. For $l = 2$ we have $c_0^{(2)} = 1 - \log 2$ and $c_i^{(2)} = 1/(i 2^i)$ for
$i \ge 1$ (see [3]). Otherwise

$$c_i^{(l)} = \sum_{j=0}^{i-1} \frac{c_j^{(l-1)}}{i\, l^{\,i-j}}$$

for $i > 0$, and

$$c_0^{(l)} = \frac{1}{l - 1} \sum_{j=1}^{\infty} \frac{c_j^{(l)}}{j + 1}.$$

In [3] it is noted that calculating coefficients $c_j^{(l)}$ for $j = 1, \ldots, 55$ is sufficient to
calculate $\rho(u)$ to a relative error of about $10^{-17}$. Throughout this thesis, when
$\rho(u)$ is computed we implicitly use the method of [3].

Remark 2.2.1 New results of Bernstein suggest that computing tight upper and lower
bounds on $\psi(r, B)$ may become a viable alternative to computing $\rho(u)$ [4]. Since full
details are not yet available, we leave this as a subject of further study.
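The Taylor series method of [3] can be sketched as follows. The recurrences used are those that follow from (2.12) together with the expansion of $\rho(u) = 1 - \log u$ on $[1, 2]$. The function name `dickman_taylor` and the truncation parameter `terms` are illustrative choices, not taken from [3].

```python
import math

def dickman_taylor(u, terms=55):
    """Evaluate rho(u) from Taylor coefficients c_i^(l), with l = ceil(u)."""
    if u <= 1:
        return 1.0
    l_max = math.ceil(u)
    # Coefficients on [1, 2]: c_0 = 1 - log 2 and c_i = 1/(i 2^i)
    c = [1 - math.log(2)] + [1 / (i * 2 ** i) for i in range(1, terms + 1)]
    for l in range(3, l_max + 1):
        prev = c
        c = [0.0] * (terms + 1)
        for i in range(1, terms + 1):
            c[i] = sum(prev[j] / (i * l ** (i - j)) for j in range(i))
        # c_0^(l) = rho(l), obtained from the truncated sum over j >= 1
        c[0] = sum(c[j] / (j + 1) for j in range(1, terms + 1)) / (l - 1)
    xi = l_max - u
    return sum(coeff * xi ** i for i, coeff in enumerate(c))
```

Since $0 \le \xi \le 1$ and the coefficients decay roughly like $2^{-i}$, a truncation at 55 terms leaves a tail of about $2^{-55}$, consistent with the accuracy quoted from [3].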


Generalisations of the Dickman Function 


There are several generalisations of the Dickman function. Here we mention some that 
are useful for analysing the appearance of large prime relations, particularly 1LP- and 
2LP-relations. Since we do not make great use of these functions in what follows, we 
mention them here only briefly. 

A thorough analysis of the appearance of relations with up to $h$ large primes requires
that we know something about the joint distribution of the sizes of the $h + 1$ largest
prime factors of random values. Let

$$\psi_k(r, B_1, \ldots, B_k) = \psi_k(r, \mathbf{B}) = \#\{n \in \mathbb{Z}^+ : n \le r \text{ and } P_j(n) \le B_j \text{ for } j = 1, \ldots, k\}.$$

Also, let

$$\rho_k(u_1, \ldots, u_k) = \rho_k(\mathbf{u}) = \lim_{r \to \infty} \frac{\psi_k(r, \mathbf{B})}{r} \tag{2.13}$$

with $u_j = (\log r)/\log B_j$. Vershik investigated in [78] the limit (2.13), and it receives
further attention in [10]. Think of $\rho_k(\mathbf{u})$ as the asymptotic joint probability that a
random value $i_r$ has its $j$-th largest prime factor at most $r^{1/u_j}$ for $j = 1, \ldots, k$.

The functions $\rho_2(\mathbf{u})$ and $\rho_3(\mathbf{u})$ are particularly relevant to our 1LP- and
2LP-smoothness considerations. The Taylor series method for computing $\rho(u)$ is extended
in [3] to compute $\rho_2(\mathbf{u})$. A further extension is given in [45] to compute
$\rho_3(\mathbf{u})$. The details are best left to the references; we make use of these methods
only very briefly in Chapter 4.

For the most part, instead of calculating $\rho_2(\mathbf{u})$ and $\rho_3(\mathbf{u})$ we make use of a
simplifying assumption. We assume that the appearance of $P_j(n)$ is independent of $P_i(n)$
for $i = 1, \ldots, j - 1$. Clearly this is not true, but it does suffice for our purposes.


2.2.2 Smooth Integers and the Number Field Sieve 


Recall from Chapter 1 that we define

$$L_x[v, w] = \exp\left((w + o(1)) (\log x)^v (\log\log x)^{1-v}\right).$$

Asymptotically the number field sieve is exciting because it achieves $v = 1/3$ whereas
all previous algorithms achieve at best $v = 1/2$. As background to the polynomial
selection problem, we should understand how the number field sieve achieves $v = 1/3$
in its run-time estimate. We are concerned only with the time taken by the sieving
stage, since this is the rate determining step.

The L-function arises in the analysis of general factorisation algorithms because it
is connected to the optimal choice of parameters controlling the appearance of smooth
integers. For a fixed smoothness bound, smaller integers are more likely to be smooth
than larger integers. Ignoring for the moment the question of how the smooth values
are collected, the better general factoring algorithms tend to be those which require
smaller integers to be smooth. The following theorem (from [14], see [48] for another
version) renders the L-function useful to our analysis by quantifying this relationship.


Theorem 2.2.2 Let $g(B)$ be a function defined on $B \ge 2$ for which $g(B) \ge 1$ and
$g(B) = B^{1+o(1)}$ as $B \to \infty$. Then as $x \to \infty$,

$$\frac{x\,g(B)}{\psi(x, B)} \ge L_x[1/2, \sqrt{2}]$$

uniformly for $B \ge 2$. Moreover,

$$\frac{x\,g(B)}{\psi(x, B)} = L_x[1/2, \sqrt{2}] \tag{2.14}$$

if and only if

$$B = L_x[1/2, 1/\sqrt{2}] \tag{2.15}$$

as $x \to \infty$.

The expression

$$\frac{x\,g(B)}{\psi(x, B)} \tag{2.16}$$

measures the effort required to find at least $B$ random values $i_x$ which are $B$-smooth
($g(B)$ is an upper bound on the effort required to test each number for $B$-smoothness).
Theorem 2.2.2 seeks the value of $B$ that minimizes (2.16). The value (2.15) does so,
and the minimum value of the required effort is given by (2.14).

Theorem 2.2.2 is also the source of the heuristic guide to relative matrix size given
in (2.8). The matrix has approximately $B$ rows and columns. Hence, we can expect
a matrix of size approximately (2.15) rows and columns from a sieving effort of
approximately (2.14) operations. Observing that

$$L_x[1/2, 1/\sqrt{2}] = \left(L_x[1/2, \sqrt{2}]\right)^{1/2}$$

gives (2.8).
Loosely stated, Theorem 2.2.2 says the following. If $x = x(N)$ is the bound on
integers which are required to be smooth by some algorithm $A$ for factoring $N$, then
with an optimal choice of parameters the asymptotic run-time of $A$ is

$$L_x[1/2, \sqrt{2}]. \tag{2.17}$$

The exercise now becomes estimating $x$ as a function of $N$.

For example, MPQS has $x = O(N^{1/2})$. Substituting this into (2.17) gives

$$\begin{aligned}
L_x[1/2, \sqrt{2}] &= \exp\left[(\sqrt{2} + o(1)) (\log N^{1/2})^{1/2} (\log\log N^{1/2})^{1/2}\right] \\
&= \exp\left[(1 + o(1)) (\log N)^{1/2} \left(\log(\tfrac{1}{2}\log N)\right)^{1/2}\right] \\
&= \exp\left[(1 + o(1)) (\log N)^{1/2} (\log\log N)^{1/2}\right] \\
&= L_N[1/2, 1],
\end{aligned}$$

which is the heuristic asymptotic run-time of MPQS. The bound $x = O(N^{1/2})$ is
exponential in $\log N$. By repeating the argument above it is clear that for all such
exponential bounds $x = O(N^{1/k})$ with $k > 1$, the run-time is $L_N[1/2, \sqrt{2/k}]$. That is,
no exponential bound on $x$ will defeat $v = 1/2$. To do so requires a bound on $x$ which
is at worst sub-exponential.

In the number field sieve, a sub-exponential bound on $x$ is achieved. The extent of
the sub-exponentiality we again measure using the L-function. The bound on $x$ is

$$x = L_N[2/3, (64/3)^{1/3}]. \tag{2.18}$$

It is a simple matter of substitution to check that (2.18) combined with (2.17) gives
the stated run-time for the number field sieve of $L_N[1/3, (64/9)^{1/3}]$. It is not such a
simple matter to derive (2.18). We outline this now.
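The substitutions mentioned above follow one leading-order rule, which the sketch below encodes. If $\log x = a (\log N)^s (\log\log N)^{1-s}$ then $\log\log x \sim s \log\log N$, so $L_x[1/2, \sqrt{2}]$ becomes $L_N[s/2, \sqrt{2as}]$. The function name is hypothetical and all $o(1)$ terms are ignored.

```python
import math

def compose_with_smoothness_bound(s, a):
    """Leading-order run-time L_N[v, w] when x = L_N[s, a] is put into (2.17).

    For s = 1 the bound is read as x = N^a. Returns the pair (v, w),
    ignoring all o(1) terms.
    """
    # (log x)^(1/2) contributes sqrt(a) (log N)^(s/2) (log log N)^((1-s)/2)
    # (log log x)^(1/2) contributes sqrt(s) (log log N)^(1/2)
    return s / 2, math.sqrt(2 * a * s)

if __name__ == "__main__":
    print(compose_with_smoothness_bound(1, 1 / 2))                  # MPQS case
    print(compose_with_smoothness_bound(2 / 3, (64 / 3) ** (1 / 3)))  # bound (2.18)
```

The first call corresponds to the MPQS bound $x = O(N^{1/2})$ and the second to the bound (2.18); the second returned constant can be compared numerically with $(64/9)^{1/3}$.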


Remark 2.2.3 We assume for consistency with [14] that $f_1$ is chosen to be the monic
base-$m$ representation of $N$. That gives $a_i$ and $m$ both $O(N^{1/d})$. In practice we use
non-monic $f_1$ of course, but this does not affect the asymptotic analysis ($d \sim d + 1$
as $d \to \infty$). We drop the assumption when deducing practical guidelines for choosing
fixed $d$ in Chapter 3.


We have variables $N$, $d$ and $B$ already defined, and we introduce $U$ to be the
maximum value of $|a|$ and $b$ across the sieve region. The idea of the analysis is that
we first deduce (using Theorem 2.2.2), for fixed $N$ and $d$, an asymptotic run-time for
optimal choices of $U$ and $B$. Then we choose a degree which minimizes, as a function
of $N$ as $N \to \infty$, this run-time. The choices of $U$ and $d$ fix the size of the integers
inspected for smoothness.

In essence, the integers inspected for smoothness are forced to be sub-exponential 
in log N by increasing d, very slowly, as a function of N. The compromise for d is 
between a high rate of change of f at high d, and large coefficients at low d. 


The main steps are as follows. The values required to be smooth are bounded by

$$x(N) = F_1(a, b) \cdot F_2(a, b) \le (d + 1) m^2 U^{d+1} \le 2 d m^2 U^{d+1}. \tag{2.19}$$

Given Remark 2.2.3, (2.19) gives

$$F_1(a, b) \cdot F_2(a, b) \le 2 d N^{2/d} U^{d+1}. \tag{2.20}$$

Assume for the moment that values $F_1(a, b) \cdot F_2(a, b)$ are as likely to be smooth as
random integers of the same size. In practice we will rely on this not being true, but
the assumption suffices asymptotically. Using this assumption, Theorem 2.2.2, and
(2.20), it is shown in [14] that optimal choices of $B$ and $U$ ensure that the asymptotic
run-time is

$$\exp\left[(1 + o(1)) \left(d \log d + \sqrt{(d \log d)^2 + 4 \log(N^{1/d}) \log\log(N^{1/d})}\right)\right]. \tag{2.21}$$


Now we need to choose $d$ to minimize (2.21). As is pointed out in [14], the minimum
value of (2.21) must occur when

$$(d \log d)^2 = O\left(\log(N^{1/d}) \log\log(N^{1/d})\right). \tag{2.22}$$

Intuitively, it is clear that $d$ ought to have the form

$$O\left((\log N)^j (\log\log N)^k\right) \tag{2.23}$$

since the appearance of $d$ on the left hand side of (2.22) must match the form of the
right hand side. Assuming $d$ is of that form and ignoring for the moment the implicit
constants, gives

$$\begin{aligned}
(d \log d)^2 &\approx (\log N)^{2j} (\log\log N)^{2k+2}, \text{ and} \\
\log(N^{1/d}) \log\log(N^{1/d}) &\approx (\log N)^{1-j} (\log\log N)^{1-k},
\end{aligned}$$

from which we obtain $j = 1/3$ and $k = -1/3$. Substituting these values yields the
constant implicit in (2.23).


The result is that the optimal value of $d$ as $N \to \infty$ is

$$d = (3^{1/3} + o(1)) \left(\frac{\log N}{\log\log N}\right)^{1/3}. \tag{2.24}$$


At fixed $N$ in our range of interest, some information is lost in the approximations
and assumptions leading to (2.24). For the asymptotic result however, (2.24) is
useful; substituting the optimal value of $d$ back into (2.21) gives the
$L_N[1/3, (64/9)^{1/3}]$ estimate. Substituting the optimal values of $U$ and $B$ into (2.20)
leads, after some manipulation, to (2.18).
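To get a feel for the size of the leading term in (2.24), the following sketch evaluates it with the $o(1)$ term dropped and with $N \approx 10^{\text{digits}}$. Both simplifications are assumptions of the sketch, and as noted above the result is only a rough guide at fixed $N$. The function name is hypothetical.

```python
import math

def leading_term_degree(digits):
    """Leading term of (2.24) for N ~ 10^digits, with the o(1) term dropped."""
    log_n = digits * math.log(10)
    return 3 ** (1 / 3) * (log_n / math.log(log_n)) ** (1 / 3)

if __name__ == "__main__":
    for digits in (130, 140, 155):
        print(digits, leading_term_degree(digits))
```

For a 140 digit $N$ the leading term lies between 5 and 6, and it grows only very slowly with the size of $N$.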

Hence, in essence, the number field sieve defeats other algorithms asymptotically
because the size of the values required to be smooth is sub-exponential in $\log N$. This is
guaranteed asymptotically by controlling, through $d$, the size of the relevant polynomial
values, thereby encouraging the polynomials to produce more smooth values.

Leveraging this advantage in practice requires using polynomials which do indeed
output many smooth values. Asymptotically the advantage comes from increasing $d$
as $N \to \infty$. Differences in yield between polynomials of fixed degree do not affect the
asymptotics. Indeed, not even the difference between monic and non-monic $F$ affects
the asymptotics. We should not ignore what is revealed in the asymptotics; neither
will we ignore what is hidden.


2.3 Polynomial Selection 


From Sections 2.1 and 2.2 we understand that (and why) smooth polynomial values
are crucial to the performance, in practice and asymptotically, of the number field
sieve. So we arrive at the polynomial selection problem. That is, given $N$, how do we
find polynomial pairs for $N$ which produce many smooth values?

We distinguish two aspects of this problem: generating candidate pairs at all, and
generating good candidate pairs. We saw in Section 2.1 that to ensure the squares
obtained from smooth values of $F_1$ and $F_2$ are congruent mod $N$, $f_1$ and $f_2$ must have
a common root mod $N$. This requirement makes the first aspect, generating candidate
pairs at all, non-trivial. Our focus in this thesis is on the second aspect, generating
good candidate pairs. We will assume procedures for generating pairs satisfying the
relevant requirements. The main purpose of this section is to describe such procedures,
that is, procedures addressing the first aspect.

There is an obvious method for generating $f_1, f_2$ with a common root mod $N$. It
is called the base-$m$ method, and was suggested for use in the number field sieve in
[14]. We saw this method briefly in Section 1.3.3. With $d = \deg F_1$ fixed in advance
and $m = O(N^{1/(d+1)})$, the coefficients of (non-monic) $f_1$ are taken from the base-$m$
expansion of $N$ and are therefore expected to satisfy $a_i = O(N^{1/(d+1)})$. Then $f_2$ is the
linear polynomial $x - m$. The base-$m$ method is restrictive, in that $m$ must be small to
keep $f_2$ and the coefficients of $f_1$ small. In general, there may exist many more pairs
$f_1, f_2$ with an arbitrarily large common root $m$ mod $N$. Since both such polynomials
are likely to be non-linear, we refer to methods giving these polynomials as non-linear
methods.

In Section 2.3.1 we describe what little is known about non-linear selection methods.
It transpires that the base-$m$ method is still the best known method for large $N$. We
give more background on the base-$m$ method in Section 2.3.2.


2.3.1 Non-linear Methods 


The first step in non-linear selection methods is an algorithm due to Peter Montgomery,
reported in [33]. It finds pairs of quadratic polynomials with a common root $m$ mod
$N$, each of whose coefficients are $O(N^{1/4})$. Analysis of the algorithm reveals that this
$O(N^{1/4})$ is $O(N^{1/(2d)})$ at $d = 2$. We call this method Montgomery's Two Quadratics
method.

Let $(d_1, d_2) = (\deg f_1, \deg f_2)$, and $d_T = d_1 + d_2$. We refer to $(d_1, d_2)$ as the
degree pair for $f_1, f_2$. Montgomery’s Two Quadratics method gives the degree pair $(2, 2)$.
Since $d_T = 4$, the comparable base-$m$ pair is $(3, 1)$. For the integers we consider in
Chapter 6, $(5, 1)$ is the appropriate base-$m$ degree pair. Whilst we might expect two
quadratic polynomials to be competitive with cubic base-$m$ pairs, we cannot expect a
pair of quadratic polynomials to be competitive beyond, say, integers of length 110-120
digits.

There are, however, prospects of extending Montgomery’s Two Quadratics method
to higher degrees. In particular, we seek two polynomials each of degree $d$ and each
of whose coefficients are $O(N^{1/(2d)})$. We are most interested in the degree pair $(3, 3)$,
since this may be competitive with $(5, 1)$ base-$m$ polynomial pairs. Below we describe
Montgomery's Two Quadratics method, and possibilities for extension.


Montgomery’s Two Quadratics Method 


The description given here is essentially reproduced from [33] with some details
omitted. Suppose we have two quadratic polynomials $f_1(x) = a_2 x^2 + a_1 x + a_0$ and
$f_2(x) = b_2 x^2 + b_1 x + b_0$ in $\mathbb{Z}[x]$. Let

$$\mathbf{a} = \begin{pmatrix} a_0 \\ a_1 \\ a_2 \end{pmatrix} \quad \text{and} \quad \mathbf{b} = \begin{pmatrix} b_0 \\ b_1 \\ b_2 \end{pmatrix}.$$

The key observation is this: $f_1$ and $f_2$ have a common root $m$ modulo $N$ if and only
if $\mathbf{a}$ and $\mathbf{b}$ are orthogonal (over $\mathbb{Z}_N$ with respect to the standard inner product) to
the vector

$$\mathbf{c} = \begin{pmatrix} 1 \\ m \\ m^2 \end{pmatrix}.$$

The elements of $\mathbf{c}$ form a geometric progression over $\mathbb{Z}_N$ with ratio $m$. The space
orthogonal to $\mathbf{a}$ and $\mathbf{b}$ has rank 1 (see [33]), therefore any $\mathbf{c}$ in that space whose
elements are in the same progression will suffice to generate the space.

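So, Montgomery begins with such a vector $\mathbf{c}$, and then constructs a basis for the
space orthogonal to $\mathbf{c}$. Indeed, if $p$ is a prime such that $p < \sqrt{N}$, the Legendre symbol
$(N/p) = 1$, and $c_1$ is a square root mod $p$ of $N$ with $|c_1 - N^{1/2}| \le p/2$, then

$$\mathbf{c} = \begin{pmatrix} c_0 \\ c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} p \\ c_1 \\ (c_1^2 - N)/p \end{pmatrix}$$

is a suitable $\mathbf{c}$ with $c_i = O(N^{1/2})$. The ratio of the elements of $\mathbf{c}$ over $\mathbb{Z}_N$ is
$m = c_1 p^{-1}$ mod $N$. The multiplication by $p^{-1}$ mod $N$ is what causes $m$ to take
arbitrarily large values mod $N$.

The following vectors $\mathbf{a}'$ and $\mathbf{b}'$ are both orthogonal over $\mathbb{Z}_N$ to $\mathbf{c}$, and in fact span
the sub-lattice of $\mathbb{Z}^3$ orthogonal to $\mathbf{c}$. We have

$$\mathbf{a}' = \begin{pmatrix} c_1 \\ -p \\ 0 \end{pmatrix} \quad \text{and} \quad \mathbf{b}' = \begin{pmatrix} (c_1 (c_2 s \bmod p) - c_2)/p \\ -(c_2 s \bmod p) \\ 1 \end{pmatrix},$$

where $s \equiv c_1^{-1} \pmod{p}$, so that the first entry of $\mathbf{b}'$ is an integer. By reducing the
basis $\{\mathbf{a}', \mathbf{b}'\}$ a basis $\{\mathbf{a}, \mathbf{b}\}$ can be found for which

$$\|\mathbf{a}\| \cdot \|\mathbf{b}\| = O(\|\mathbf{c}\|) = O(N^{1/2}).$$

In practice both $\|\mathbf{a}\|$ and $\|\mathbf{b}\|$ are $O(N^{1/4})$. Each $p$ gives a distinct pair of
polynomials, so what remains is to search amongst many pairs of polynomials for the best
ones.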
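The construction above can be sketched in a few lines for small $N$. The sketch makes several assumptions that are not taken from [33]: it restricts to primes $p \equiv 3 \pmod 4$ so that a square root mod $p$ is a single modular exponentiation, it takes $s$ to be the inverse of $c_1$ modulo $p$, and it uses Lagrange-Gauss reduction for the rank 2 lattice. All function names are hypothetical, and `pow` with a negative exponent requires Python 3.8 or later.

```python
from math import isqrt

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gauss_reduce(u, v):
    """Lagrange-Gauss reduction of a rank-2 integer lattice basis."""
    while True:
        if dot(u, u) > dot(v, v):
            u, v = v, u
        uu = dot(u, u)
        q = (2 * dot(u, v) + uu) // (2 * uu)    # nearest integer to (u.v)/(u.u)
        if q == 0:
            return u, v
        v = tuple(y - q * x for x, y in zip(u, v))

def is_prime(n):
    return n > 1 and all(n % q for q in range(2, isqrt(n) + 1))

def suitable_primes(N, count):
    """First `count` primes p < sqrt(N) with p = 3 (mod 4) and (N/p) = 1."""
    found, p = [], 3
    while len(found) < count and p < isqrt(N):
        if p % 4 == 3 and is_prime(p) and pow(N, (p - 1) // 2, p) == 1:
            found.append(p)
        p += 2
    return found

def two_quadratics(N, p):
    """Return (a, b, m): coefficient vectors (constant term first) of two
    quadratics with common root m modulo N, for a prime p as above."""
    r = pow(N, (p + 1) // 4, p)                 # a square root of N mod p
    if (r * r - N) % p != 0:
        raise ValueError("N is not a square modulo p")
    k = (isqrt(N) - r + p // 2) // p
    c1 = r + k * p                              # root of N mod p near sqrt(N)
    c2 = (c1 * c1 - N) // p                     # exact division
    s = pow(c1, -1, p)                          # assumed meaning of s
    t = (c2 * s) % p
    a_start = (c1, -p, 0)
    b_start = ((c1 * t - c2) // p, -t, 1)
    a, b = gauss_reduce(a_start, b_start)
    m = (c1 * pow(p, -1, N)) % N
    return a, b, m

def evaluate(coeffs, x, modulus):
    """Evaluate sum(coeffs[i] * x^i) modulo modulus."""
    return sum(c * pow(x, i, modulus) for i, c in enumerate(coeffs)) % modulus

if __name__ == "__main__":
    N = 10007 * 10009
    for p in suitable_primes(N, 3):
        a, b, m = two_quadratics(N, p)
        print(p, a, b, evaluate(a, m, N), evaluate(b, m, N))
```

Both evaluations print as zero because the reduced vectors remain orthogonal over the integers to $\mathbf{c}$, which is congruent to $p\,(1, m, m^2)$ modulo $N$.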


Remark 2.3.1 The number field sieve can be generalised to use $k$ polynomials all
with a common root $m$ mod $N$. An early but highly theoretical suggestion along
these lines is contained in [20]. A more practical version is suggested in [32]. Crucial
to the success of this suggestion is an adequate means of polynomial selection. The
implementation described in [32] uses Montgomery's Two Quadratics method with
small coprime linear combinations of the polynomials found. We refer to this scheme
again briefly in Chapters 4 and 7.


Extensions to Higher Degree 


The fact which endears quadratic polynomials to Montgomery’s construction is that
the space orthogonal to $\mathbf{a}$ and $\mathbf{b}$ has rank 1. In general, we desire two polynomials of
degree $d$ with coefficient vectors $\mathbf{a}, \mathbf{b} \in \mathbb{Z}^{d+1}$. The space orthogonal to $\mathbf{a}$ and $\mathbf{b}$ has
rank $d - 1$. So now we need $d - 1$ polynomials whose coefficient vectors (of length
$d + 1$) are mutually orthogonal to the same geometric progression mod $N$.

Montgomery suggests generating them in the following manner [53]. Suppose we
begin with a single vector $\mathbf{c} \in \mathbb{Z}^{2d-1}$ whose coefficients are in geometric progression
modulo $N$. The $d - 1$ coefficient vectors are now read off from $\mathbf{c}$; they are precisely the
sequences of length $d + 1$ consisting of consecutive elements of $\mathbf{c}$. We need to enforce
some restriction on $\|\mathbf{c}\|$, to control $\|\mathbf{a}\| \cdot \|\mathbf{b}\|$. We again assume that in practice
$\|\mathbf{a}\| \approx \|\mathbf{b}\|$. That being the case we need $c_i = O(N^{1-1/d})$ to ensure that
$a_i = b_i = O(N^{1/(2d)})$ if $\mathbf{a}$ and $\mathbf{b}$ are constructed in this way.

Hence, polynomials with $(d, d)$ degree pairs would follow from the construction
of small geometric progressions $\mathbf{c} \in \mathbb{Z}^{2d-1}$ mod $N$ (where small means each
$c_i = O(N^{1-1/d})$). At $d = 3$, we require geometric progressions mod $N$ of length 5 with
each $c_i = O(N^{2/3})$. Initial experiments and counting arguments suggested that for large
$N$, such progressions could be difficult to find. Hence we do not pursue this here (see
Chapter 7).

Notice that the “small geometric progressions” construction gives $\deg f_1 = \deg f_2$.
Other combinations may be preferable, for example with $d_T = 6$ the pair $(4, 2)$ could
be useful. Methods to generate such polynomials are not known.


2.3.2 The Base-m Method 


For large $N$ therefore, the base-$m$ method is still the method of choice. The
base-$m$ method is very simple; here we describe it and some existential arguments in the
literature concerning approximately optimal choices of such polynomials.

Let the coefficients of the base-$m$ representation of $N$ be $a_i^{(m)}$. That is,

$$N = \sum_{i=0}^{d} a_i^{(m)} m^i$$

with $0 \le a_i^{(m)} < m$. More generally, the base-$m$ representation of $kN$ for some small
$k \in \mathbb{Z}$ can be taken. For ease of exposition we assume that just the representation of
$N$ is taken.

It is not necessary that the polynomial $F(x, 1) = f(x)$ be the true base-$m$ expansion
of $N$, simply that

$$f(m) \equiv 0 \bmod N. \tag{2.25}$$

Any alteration can be made to the coefficients of $f$ provided the property (2.25) is
preserved. An alteration which leaves $f$ with smaller coefficients is, at least
heuristically, useful. In particular, if $a_i > \lfloor m/2 \rfloor$ then making the replacements

$$a_i \to a_i - m \quad \text{and} \quad a_{i+1} \to a_{i+1} + 1 \tag{2.26}$$

leaves the representation with smaller $a_i$, whilst preserving (2.25). We assume below
that $f$ has been reduced in this way, working from $i = 0, \ldots, d$ through the
coefficients. Note that some authors perform an LLL reduction [46] on the lattice created
by transformations of this type in the hope of finding slightly smaller coefficients (see
[80] and [85]). We omit this computation and proceed simply with the adjustment at
(2.26). More details of our procedures emerge in Chapter 5.
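The base-$m$ expansion and the adjustment (2.26) can be sketched as follows. The choice $m = \lfloor N^{1/(d+1)} \rfloor$ is made here purely for illustration, and the function names are hypothetical. The sketch applies (2.26) for $i = 0, \ldots, d - 1$ only, since the leading coefficient has no higher coefficient to carry into.

```python
def integer_root(n, k):
    """Largest integer x with x**k <= n, found by binary search."""
    lo, hi = 0, 1
    while hi ** k <= n:
        hi *= 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** k <= n:
            lo = mid
        else:
            hi = mid
    return lo

def base_m_polynomial(N, d, m=None):
    """Return (coeffs, m) with sum(coeffs[i] * m**i) == N after applying (2.26)."""
    if m is None:
        m = integer_root(N, d + 1)          # illustrative choice of m
    coeffs = []
    rest = N
    for _ in range(d):
        rest, digit = divmod(rest, m)
        coeffs.append(digit)
    coeffs.append(rest)                     # leading coefficient a_d
    for i in range(d):
        if coeffs[i] > m // 2:              # replacement (2.26)
            coeffs[i] -= m
            coeffs[i + 1] += 1
    return coeffs, m

if __name__ == "__main__":
    coeffs, m = base_m_polynomial(10 ** 30 + 57, 3)
    print(m, coeffs)
```

After the adjustment the non-leading coefficients satisfy $|a_i| \le \lfloor m/2 \rfloor$, and $f(m) = N$ still holds exactly, so (2.25) is preserved.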

A slight variation on the base-$m$ method is suggested in [14]. The suggestion uses
the full “homogeneity” of $F_1$ and $F_2$. That is, instead of fixing $m$ as a root mod
$N$ of $f_1(x)$, fix $(m_1, m_2)$ as a root mod $N$ of $F_1(x, y)$. Then $F_2(x, y)$ is given by
$F_2(x, y) = m_2 x - m_1 y$. The advantage in doing so is that some of the size borne by a
single value $m$ can be shared across $m_1$ and $m_2$, and so across the coefficients of $F_2$.
Essentially, this is the analogue for $F_2$ of using non-monic $F_1$.


Remark 2.3.2 Unfortunately, choosing good base-$m$ polynomials is already a difficult
problem, so choosing good base-$(m_1, m_2)$ polynomials has received little attention in
the literature. It is noted in [5] however that this method could be re-considered should
improvements be made to the choice of base-$m$ polynomials that make it worthwhile
(see Chapter 7).


For the moment then, we are stuck with (possibly modified) base-m polynomials. 
How good can we expect these polynomials to be? There is some discussion on this 
question in [14] and [5]. Next we summarise these discussions, with some adjustments. 

Suppose $\max |a_i| \le A$ and $m \le M$. We have two requirements of $F_1$ and $F_2$: that $F_1$
and $F_2$ have a common root mod $N$, and that this should hold for all integers $1, \ldots, N$.
(The argument that follows can be adjusted for the case that the common root is
required only for some positive $n \le N$, and this corresponds to the special number
field sieve.) The common root requirement means that, in particular, $f_1(m) \ge N$,
which gives

$$O(A M^d) \ge N. \tag{2.27}$$

The requirement that this hold for all integers $1, \ldots, N$ means that the number of
integers representable by the possible $f_1$ at possible $m$ must exceed $N$. That is,

$$O(A^{d+1} M) \ge N. \tag{2.28}$$


Let $A = N^u$ and $M = N^v$. In [14] and [5] the argument is that for fixed $N$ and $d$,
some guidance can be obtained on optimal values for $u$ and $v$ by requiring (2.27) and
(2.28) to be equalities. Doing so, and solving for $u$ and $v$, gives

$$u = \frac{d - 1}{d^2 + d - 1} \quad \text{and} \quad v = \frac{d}{d^2 + d - 1}.$$

Notice that these values differ from those in [14], because there the authors consider
the prospects for base-$(m_1, m_2)$ polynomials, whereas we do not yet permit non-monic
$f_2$.

For example, with $N = $ RSA-140, we can hope to obtain at best $A \le 10^{19.3}$ if
$M \le 10^{24.1}$. That is, coefficients approximately five digits smaller than $m$. In effect,
our methods achieve this. As we see in Chapter 6, using root properties we can
effectively shave up to three digits from the coefficients of $F_1$. Simultaneously, we save
approximately up to two digits by having regard to the size of the values taken by $F_1$.
Now we investigate how this comes about.
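The exponents in the counting argument above come from solving the two linear conditions $u + dv = 1$ and $(d + 1)u + v = 1$, which are (2.27) and (2.28) taken as equalities with the implied constants ignored. The sketch below does this with exact rational arithmetic. The function names are hypothetical and $N$ is approximated by $10^{\text{digits}}$.

```python
from fractions import Fraction

def base_m_exponents(d):
    """Solve u + d*v = 1 and (d + 1)*u + v = 1 for the exponents u and v."""
    denom = d * d + d - 1
    return Fraction(d - 1, denom), Fraction(d, denom)

def digit_bounds(digits, d):
    """Decimal digits of A = N^u and M = N^v for N ~ 10^digits."""
    u, v = base_m_exponents(d)
    return float(u * digits), float(v * digits)

if __name__ == "__main__":
    print(base_m_exponents(5))
    print(digit_bounds(140, 5))
```

For $d = 5$ and a 140 digit $N$ this gives about 19.3 digits for $A$ and about 24.1 digits for $M$, matching the example in the text.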


Chapter 3 


Properties which Influence Yield 


In this chapter and the next we study polynomial yield. This chapter establishes a 
framework by parameterising the effects of the relevant properties. The next chapter 
uses this framework to investigate the influence of the properties more thoroughly. 

As noted in Chapter 1, there are two factors which influence the yield of a given
number field sieve polynomial $F_1$. We call the factors size and root properties. By size
we refer to the magnitude of the values taken by $F_1$. By root properties we refer to
the distribution of the roots of $F_1$ modulo small $p^k$, for $p$ prime and $k \ge 1$. We are
interested in the effect of root properties on the likelihood of $F_1$ values being smooth.
In short, if $F_1$ has many roots modulo small $p^k$, values taken by $F_1$ “behave” as if they
are smaller than they actually are. That is, on average, the likelihood of $F_1$ values
being smooth is increased.

It has always been well understood that size affects the yield of $F_1$. The influence of
root properties however, has not previously been either well understood or adequately
exploited. Hence in this chapter we focus more on root properties than size.

In Section 3.1 however we do consider briefly an issue regarding size that is peculiar 
to the number field sieve. In particular, we discuss the choice of polynomial degree d 
for given N. 

In Section 3.2 we lay the foundations for quantifying the effect of root properties. 
Recall from Section 1.4.1 that we use the term random value i_r to refer to an integer 
chosen uniformly at random from {i ∈ Z : 1 < i < r}. Here, the effect of root properties 
is quantified by comparing polynomial values v to random values i_r with r = v. We give 
a heuristic estimate of the expected contribution of each prime p to each value. That 
is, we estimate the average exponent of p appearing in a sample of factorisations. The 
expected contributions of each p are different for F-values compared to random values. 
This gives a means of assessing the “behaviour” of the typical F-value compared to a 
random value of the same size. From this, we quantify the effect of root properties. 

This is an adaptation of an approach used in the analysis of the continued fraction 
method and of MPQS, which we discuss briefly in Section 3.2. We then calculate 
contributions of p in some relevant cases, check empirically the validity of the 
estimations, and so deduce our parameter α(F), which is used to quantify the effect of root 
properties. Finally, we consider root properties with respect to the degree of F. In the 
quadratic case, we demonstrate the significance of attention to root properties. We 
also examine the average root structure for polynomials of higher degrees. 


Section 3.3 contains a summary of this chapter. 


3.1 Size 


The manner in which size influences yield is clear. We saw in Section 2.2.1 that the 
smoothness probability of random values i_r, as a function of r, is well understood. 
Hence, given N and d, the exercise in choosing (F_1, F_2) with good size is clear: the 
size of the values over which sieving is to occur should be kept small. 

However, d of course is not fixed initially. We saw in Section 2.2.1 that the key to the 
asymptotic performance of the number field sieve is that the degree of F_1 is optimised 
to minimize the run-time of sieving. These, however, are asymptotic considerations. 
Here we consider in more detail what is the best choice of d for N in the current range 
of interest and for the polynomials we currently use. 

First we recall some details from Section 2.2.2. Assume that we are working with 
base-m polynomials, so that F_1(x,y) is the non-linear polynomial and F_2(x,y) the linear 
polynomial. If U is an upper bound for the values |a| and b defining the sieve region, 
then 

    F_1(a,b) \cdot F_2(a,b) < 2 d m^2 U^{d+1}                              (3.1) 

is an upper bound on the values inspected for smoothness during sieving. Assuming 
that F_1 is monic, and therefore that m ≈ N^{1/d}, this gives 

    F_1(a,b) \cdot F_2(a,b) < 2 d N^{2/d} U^{d+1}.                         (3.2) 


Using (3.2), it is shown in [14] that optimal choices of U and B ensure that the run-time 
of the number field sieve, with d and N fixed, does not exceed 

    \exp\left( \left(\tfrac{1}{2} + o(1)\right) \left( d\log d + \sqrt{(d\log d)^2 + 4\log(N^{1/d})\,\log\log(N^{1/d})} \right) \right).   (3.3) 

Choosing d to minimize (3.3) gives the optimal choice of d asymptotically as 

    d = \left( 3^{1/3} + o(1) \right) \left( \frac{\log N}{\log\log N} \right)^{1/3}.   (3.4) 


As d → ∞, which it does very slowly, (3.4) is a useful indication of the appropriate 
value of d. But at small d, some of the approximations leading from (3.3) to (3.4) may 
be misleading. Moreover, (3.3) carries the assumption from (3.2) that F_1 is monic. 
Whilst this makes no difference asymptotically, it may affect the ranges of appropriate 
d when d is small. Hence, below we re-write (3.3) for the non-monic case and consider 
the new expression for small d and for N in the range of interest. 


Using non-monic F_1, the upper bound deduced from (3.1) becomes 

    F_1(a,b) \cdot F_2(a,b) < 2 d N^{2/(d+1)} U^{d+1}.                     (3.5) 

Using (3.5) in place of (3.2) and repeating the argument from [14] which leads to (3.3) 
gives that the time taken for the number field sieve to factorise N is at worst given by 

    \exp\left( \left(\tfrac{1}{2} + o(1)\right) \left( d\log d + \sqrt{(d\log d)^2 + 4\log(N^{1/(d+1)})\,\log\log(N^{1/(d+1)})} \right) \right),   (3.6) 

using a value for d which minimizes the expression. 

That is, to factorise N we should use d which minimizes 

    E(d, N) = d\log d + \sqrt{(d\log d)^2 + 4\log(N^{1/(d+1)})\,\log\log(N^{1/(d+1)})}. 


Table 3.1 gives values of E(d, N) for d and N in the range of interest. The values N_i in 
the table are the integers 10^{i-1}, that is, integers with i digits. For each N_i the optimal 
value of d is bolded. For the purposes of illustration, the table begins with integers of 
length 80 digits. Beware that for integers up to, say, 110 digits long, Montgomery’s 
Two-Quadratics method may be preferable to the base-m method. 

Table 3.1 shows that the relevant degrees are d = 4,5,6. The cut-off between d = 4 
and d = 5 is at approximately 120 digits. The cut-off between d = 5 and d = 6 is 
at approximately 220 digits (these figures of course should be used only as a rough 
guide). 
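
The entries of a table such as Table 3.1 can be recomputed directly from the expression for 
E(d, N). The following is a minimal sketch, assuming natural logarithms and N_i = 10^(i-1); 
the function names E and best_degree are introduced here only for illustration. 

```python
import math


def E(d, digits):
    """E(d, N) for N = 10**(digits - 1), the smallest integer with `digits` digits."""
    log_n = (digits - 1) * math.log(10)   # log N
    log_m = log_n / (d + 1)               # log(N^(1/(d+1))), non-monic base-m case
    dlogd = d * math.log(d)
    return dlogd + math.sqrt(dlogd ** 2 + 4 * log_m * math.log(log_m))


def best_degree(digits, degrees=range(3, 8)):
    """Degree among `degrees` that minimises E(d, N)."""
    return min(degrees, key=lambda d: E(d, digits))


if __name__ == "__main__":
    for digits in range(80, 261, 20):
        row = "  ".join(f"d={d}: {E(d, digits):6.2f}" for d in range(3, 8))
        print(f"{digits:3d} digits  {row}  -> best d = {best_degree(digits)}")
```

Since E(d, N) is only an upper-bound heuristic, the printed cut-offs should be read as the 
same kind of rough guide as the table itself. 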


Remark 3.1.1 We have considered only base-m polynomials in this section. That is, 
we have pairs of polynomials (F_1, F_2) with degree pair (d, 1). It is entirely possible, in 
fact probably true, that other combinations of degree are preferable. For example, the 
degree combination (2, 4) is likely to be preferable to (5, 1). As noted in Section 2.3.1, 
since we do not presently know how to generate such polynomial pairs with a common 
root mod N, we do not consider such possibilities here. 


3.2 Root Properties 


We turn now to the main concern of this chapter, root properties. In Section 3.2.1 
we explain the model we use to quantify root properties, the so-called typical F-value 
model. In Section 3.2.2 we estimate in general the key quantity in this model. In 
Section 3.2.3 we use this estimate to construct the parameter α(F) which quantifies 
the average effect of root properties in F-values. That completes the parameterisation 
of root properties. In Section 3.2.4 we go on to consider the effect on root properties 
of varying d, for the relevant cases d = 4, 5, 6. 


Table 3.1: E(d, N) at relevant d and N 


3.2.1 The Typical F-value 


Ideas similar to the “typical F-value” analysis presented here for the number field 
sieve have previously been introduced for analysis of MPQS [6] and the continued 
fractions method [39]. The Knuth-Schroeppel analysis of [39] examines the use of small 
multipliers k in the continued fractions method; Boender’s analysis in [6] extends this 
in the context of small multipliers for MPQS. 

The situation regarding small multipliers is similar for the continued fractions 
method to that for MPQS, so we elaborate only on MPQS. In MPQS, kN for some 
small k may be a quadratic residue for more small p than N is. The benefit is an 
increased likelihood that values of the relevant quadratic polynomial will be smooth; 
the cost is that now the integer to be factorised is larger. Hence, analysis is required 
to choose k so that the benefit exceeds the cost, hopefully optimally. 

The idea which we distil from [39] and [6] is that it is useful to examine the quantity 
which we refer to as cont_p(v). 


Definition 3.2.1 Denote by ord_p v the exponent of the largest power of p dividing v. 
Then cont_p(v) is the expected value of ord_p v as v ranges across some sample S. 


So, on average, v ∈ S looks (across the primes at most B) like 

    \log v = \sum_{p<B} \mathrm{cont}_p(v) \cdot \log p.                   (3.7) 

In the special case where S is a set of F-values v, we denote cont_p(v) by cont_p(F) and 
refer to the exponential of the value in (3.7) as the typical F-value. 

In [39] cont_p(kN) is called f(p, kN). In [6] a comparison is made between cont_p(v) 
for values v of quadratic MPQS polynomials and for random values v = i_r (although 
in [6] the terminology is different). Here we make a similar comparison; we compare 
the typical F-value, for number field sieve polynomials, to the typical random value. 

Notice that cont_p(v) for v ∈ S is easy to check empirically. For sufficiently large S 
we expect 

    \mathrm{cont}_p(v) = \frac{\sum_{v \in S} \mathrm{ord}_p v}{|S|}.      (3.8) 

In particular, for F-values, cont_p(F) can be determined by factorising a small, but not 
too small, set of F-values in the appropriate range. For most p, however, we can do 
better by giving a heuristic explicit form for cont_p(F). The primes p for which we can 
do this are precisely those for which we can assume that the full contribution of p in 
a given F-value is associated with a single contribution from a single root mod p of 
F. That is, the primes p which are unramified. Ramified primes must divide Δ, the 
discriminant of f (see for example [18]). As a coarse filter on ramified primes, we refer 
to p for which p | Δ as poorly-behaved primes; otherwise p is well-behaved. 

In the following subsection we give heuristic estimates of cont_p in the relevant 
cases, for well-behaved primes. Contributions of poorly-behaved primes p could be 
obtained by computing the ideal decomposition of (p) ([18], Section 6.2). However, we 
find in practice it is simpler to compute these contributions directly from a sample of 
factorisations, as in (3.8). 
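
The empirical estimate (3.8), and the sum in (3.7) built from it, can be sketched in a few 
lines. This is an illustrative sketch only: the helper names ord_p, cont_p_sample and 
typical_log_value are assumptions of this example, and the sample is any list of non-zero 
integers supplied by the caller. 

```python
import math


def ord_p(v, p):
    """Exponent of the largest power of p dividing the non-zero integer v."""
    v = abs(v)
    if v == 0:
        raise ValueError("ord_p is undefined for 0")
    k = 0
    while v % p == 0:
        v //= p
        k += 1
    return k


def cont_p_sample(sample, p):
    """Sample mean of ord_p(v) over the sample, as in (3.8)."""
    return sum(ord_p(v, p) for v in sample) / len(sample)


def typical_log_value(sample, primes):
    """Sum of cont_p(v) * log p over the given primes, as in (3.7)."""
    return sum(cont_p_sample(sample, p) * math.log(p) for p in primes)
```

For poorly-behaved primes the same functions apply unchanged, since they make no use of 
the root structure of the polynomial that produced the sample. 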


3.2.2 Estimating cont_p(v) 


It is useful to distinguish three cases: the random value i_r, polynomial values of the 
form F(x, 1) = f(x) = v, and polynomial values of the more general form F(x, y) = v. 
We develop estimates of cont_p(v) in these cases from ideas due to Peter Montgomery. 

Consider a random value i_r. It is possible that powers p^k for integer k > 1 also 
divide i_r, so we expect p to appear as 

    \frac{1}{p} + \frac{1}{p^2} + \cdots = \frac{1}{p-1} 

in i_r. Hence, we take the average contribution of p to i_r to be 

    \mathrm{cont}_p(i_r) = \frac{1}{p-1}.                                  (3.9) 

Even though (3.9) is merely a heuristic estimate, it works well in practice. Table 3.2 
shows estimated and actual contributions of p < 50 in a total of 10^5 integers chosen 


uniformly at random in the interval [107°, 1071). 


     p | Actual | Estimate 

     2 | 100213 |  100000 
     3 |  50280 |   50000 
     5 |  25062 |   25000 
     7 |  16808 |   16667 
    11 |  10118 |   10000 
    13 |   8196 |    8333 
    17 |   6202 |    6250 
    19 |   5529 |    5556 
    23 |   4590 |    4545 
    29 |   3629 |    3571 
    31 |   3333 |    3333 
    37 |   2786 |    2778 
    41 |   2446 |    2500 
    43 |   2401 |    2381 
    47 |   2263 |    2174 


Table 3.2: Actual and expected contributions of p in i_r for p < 50 
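
A comparison of this kind is simple to repeat. The sketch below is a simplification of the 
experiment: instead of drawing random integers it averages ord_p over every integer in 
[1, M], which is deterministic and therefore reproducible. The names ord_p and mean_ord are 
ours. 

```python
def ord_p(v, p):
    """Exponent of the largest power of p dividing the positive integer v."""
    k = 0
    while v % p == 0:
        v //= p
        k += 1
    return k


def mean_ord(p, M):
    """Average of ord_p(v) over all integers v in [1, M]."""
    return sum(ord_p(v, p) for v in range(1, M + 1)) / M


if __name__ == "__main__":
    M = 10 ** 5
    for p in (2, 3, 5, 7, 11, 13):
        print(f"p = {p:2d}  average = {mean_ord(p, M):.5f}  1/(p-1) = {1 / (p - 1):.5f}")
```

Multiplying each average by the sample size gives counts on the same scale as the 
“Actual” column above. 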


Consider polynomial values of the form f(x) = F(x,1) = v. These values are of 
use as an easier version of the more general case, and are of interest in their 
own right when examining line sieving. Now, since each root mod p corresponds to a 
unique root mod higher powers of p by Hensel lifting, the full contribution from each 
root is 1/(p - 1). Think of each root mod p as a distinct opportunity for an f-value 
to be divisible by p. If there are q_p distinct roots of f mod p then we take the full 
contribution of p to the typical f-value to be 

    \mathrm{cont}_p(f) = \frac{q_p}{p-1}.                                  (3.10) 

Computational evidence for (3.10) being a good estimate appears after we discuss the 
next case. 
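
As a small illustration of (3.10), the sketch below counts the roots of f mod p by brute 
force and compares q_p/(p - 1) with the average of ord_p f(x) over a range of x. The example 
polynomial f(x) = x^2 + 1 with p = 5 and all function names are assumptions made for this 
example; 5 does not divide the discriminant -4, so p is well-behaved here. 

```python
def f_value(coeffs, x):
    """Evaluate f(x) by Horner's rule; coeffs = [a_d, ..., a_1, a_0]."""
    v = 0
    for c in coeffs:
        v = v * x + c
    return v


def count_roots(coeffs, p):
    """Number q_p of x in {0, ..., p-1} with f(x) = 0 mod p."""
    return sum(1 for x in range(p) if f_value(coeffs, x) % p == 0)


def ord_p(v, p):
    """Exponent of the largest power of p dividing the non-zero integer v."""
    v = abs(v)
    k = 0
    while v % p == 0:
        v //= p
        k += 1
    return k


def mean_ord_f(coeffs, p, xs):
    """Average of ord_p(f(x)) over the x in xs with f(x) != 0."""
    values = [v for v in (f_value(coeffs, x) for x in xs) if v != 0]
    return sum(ord_p(v, p) for v in values) / len(values)


if __name__ == "__main__":
    f, p = [1, 0, 1], 5                      # f(x) = x^2 + 1
    q = count_roots(f, p)
    print("q_p =", q, " estimate =", q / (p - 1))
    print("observed =", mean_ord_f(f, p, range(5 ** 4)))
```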

Consider now polynomial values of the form F(x,y) for coprime x and y. We no 
longer have a unique correspondence between roots of F(x,1) mod p and roots mod 
p^k for k > 1. Moreover, an extra class of roots emerges from the possibility that 
p | y. Indeed, if also p | a_d then p | F(x,y) since F is homogeneous. We call these roots 
projective roots. 

Let q_p now be the number of roots mod p of F(x,y). That is, q_p includes the roots 
x/y of F(x,1) mod p and projective roots. The full contribution of p to the typical 
value F(x,y) = v with x and y coprime is given by 

    \mathrm{cont}_p(F) = q_p \frac{p}{p^2 - 1}.                            (3.11) 

To see this, we will count the contribution of p^k for some fixed k ∈ Z^+, then sum 
these contributions over k. 
Since F is homogeneous, think of the coprime pairs (x, y) as points on the projective 
line. For the purposes of counting the different combinations of x and y it is useful to 
consider classes of points as follows. 


There are three cases, labelled “s”, “0”, “∞”: 


1. Case 1: x/y = “s” for some s ∈ Z_{p^k} with s ≢ 0 mod p (that is, neither x nor y 
is divisible by p). 


2. Case 2: x/y = “0”, that is, x ≡ 0 mod p. 


3. Case 3: x/y = “∞”, that is, y ≡ 0 mod p. 


Now count the number of classes which fall into each case. 


1. Case 1: There is one class in Case 1 for each s ∈ Z_{p^k} not divisible by p, so there 
are φ(p^k) = p^{k-1}(p - 1) classes in Case 1. 


2. Case 2: There is one class in Case 2 for each value x ∈ Z_{p^k} divisible by p, so 
there are p^{k-1} classes in Case 2. 


3. Case 3: Similarly to 2, there are p^{k-1} classes in Case 3. 


So there are a total of p^{k-1}(p + 1) classes from Cases 1-3. 


Each class has the same number of points (x, y) contributing to it: 


1. Case 1: For fixed s and given some x ∈ Z_{p^k} not divisible by p, y is uniquely 
determined. So there are φ(p^k) pairs contributing to each class (the class is 
determined by s) in Case 1. 


2. Case 2: For a fixed value of x ≡ 0 mod p, y may take any invertible value in 
Z_{p^k}, so there are φ(p^k) pairs contributing to each class (the class is determined 
by x) in Case 2. 


3. Case 3: Similarly to 2, there are φ(p^k) pairs contributing to each class in Case 
3. 


Hence, a coprime pair (x, y) ∈ Z_{p^k} × Z_{p^k} selected uniformly at random will fall into 
a particular class with probability 

    \frac{1}{p^{k-1}(p+1)}. 

Precisely q_p of these classes correspond to roots mod p of F(x,y), so the probability 
that such a pair actually contributes p^k is 

    \frac{q_p}{p^{k-1}(p+1)}. 

Of course, one p-th of these contributions will be counted again when we count the 
contribution from p^{k+1}, so the contribution only from p^k is 

    \frac{q_p}{p^{k-1}(p+1)} \left( 1 - \frac{1}{p} \right). 

Logarithmically, p^k contributes k appearances of p, so we take the full contribution of 
p to be given by 

    \mathrm{cont}_p(F) = \sum_{k=1}^{\infty} \frac{k\, q_p}{p^{k-1}(p+1)} \left( 1 - \frac{1}{p} \right) 
                       = \frac{q_p}{p+1} \left( 1 - \frac{1}{p} \right) \sum_{k=1}^{\infty} \frac{k}{p^{k-1}} 
                       = \frac{q_p}{p+1} \left( 1 - \frac{1}{p} \right) \left( 1 - \frac{1}{p} \right)^{-2} 
                       = q_p \frac{p}{p^2 - 1}. 

Computational evidence for the estimates (3.10) and (3.11) is given in Table 3.3. 
This table contains estimated and actual contributions of p < 100 for well-behaved 
primes of a particular polynomial P,¡. Primes p at which F' has no roots are omitted. 
Polynomial P}; is a polynomial considered for the factorisation of RSA-130 (see Section 
6.1.1 and Appendix B). We considered 104 values of P,,. Only p = 2 is not well-behaved 
for Pią. We have repeated these counts on many polynomials and these results are 
typical. 

We conclude that estimates (3.9), (3.10) and (3.11) are good estimates of cont_p for 
well-behaved primes in each case. 


3.2.3 Quantifying Root Properties 


It is now possible to make a comparison between F-values v and random values i_r with 
v = i_r. 

During sieving, notionally the full contribution of each prime p < B is removed 
from each value being sieved. In fact we start with the log of the value and subtract 
the log of each contribution. So after sieving a random value i_r would appear as 

    \log i_r - \sum_{p<B} \frac{\log p}{p-1}.                              (3.12) 


Table 3.3: Actual and expected contributions of p < 100 for Pn 


Each polynomial value F(x,y) = v or f(x) = v after sieving appears as 

    \log v - \sum_{p<B} \mathrm{cont}_p(v) \cdot \log p.                   (3.13) 

In each case we call the difference between (3.12) and (3.13) the parameter α, so we 
have 

    \alpha = \sum_{p<B} \left( \frac{1}{p-1} - \mathrm{cont}_p(v) \right) \log p. 

Over the well-behaved primes, (3.10) and (3.11) give 

    \alpha(f) = \sum_{p<B} (1 - q_p) \frac{\log p}{p-1},   and 

    \alpha(F) = \sum_{p<B} \left( 1 - q_p \frac{p}{p+1} \right) \frac{\log p}{p-1}. 

Hence, for example in the latter case we have 

    \log F(x,y) = \log i_r + \alpha(F) 

whereby we consider F-values to behave like random integers whose logarithm has 
been adjusted by α. That is, the value F(x,y) behaves like a random integer of size 
F(x,y)·e^{α(F)}. So if α(F) < 0 we consider F-values to be more likely to be smooth 
than random integers of the same size. 

By inspection it is clear that α(F) receives its most negative contributions when 
q_p is large for very small p. That is, α(F) is more negative when F has many roots 
modulo small p. 
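
The formula for α(F) over the well-behaved primes translates directly into code. The sketch 
below is a minimal illustration, not the implementation used for the computations reported 
in this thesis: roots are counted by brute force, one projective root is added when p divides 
the leading coefficient, and poorly-behaved primes are simply passed in by the caller through 
the hypothetical `skip` argument, to be handled separately from a sample as in (3.8). 

```python
import math


def primes_below(B):
    """All primes p < B by a simple sieve of Eratosthenes."""
    sieve = [True] * max(B, 2)
    sieve[0:2] = [False, False]
    for i in range(2, int(B ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, B, i):
                sieve[j] = False
    return [i for i in range(2, B) if sieve[i]]


def count_roots(coeffs, p):
    """q_p for F(x, y) = sum a_i x^i y^(d-i), with coeffs = [a_d, ..., a_0]."""
    q = 0
    for x in range(p):
        v = 0
        for c in coeffs:
            v = (v * x + c) % p              # Horner evaluation of F(x, 1) mod p
        if v == 0:
            q += 1
    if coeffs[0] % p == 0:
        q += 1                               # projective root (x : y) = (1 : 0)
    return q


def alpha(coeffs, B, skip=()):
    """Sum over primes p < B, p not in skip, of (1 - q_p p/(p+1)) log p / (p-1)."""
    total = 0.0
    for p in primes_below(B):
        if p in skip:
            continue
        q = count_roots(coeffs, p)
        total += (1 - q * p / (p + 1)) * math.log(p) / (p - 1)
    return total


if __name__ == "__main__":
    # F(x, y) = x^2 + y^2, skipping p = 2, which divides the discriminant -4
    print(alpha([1, 0, 1], 100, skip={2}))
```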


3.2.4 Root Properties and d 


It emerges from Sections 2.2.2 and 3.1 that asymptotically and in practice, the choice 
of d = deg F is crucial to controlling the size of the values inspected for smoothness 
during sieving. Now, root properties are determined by the distribution of the roots of 
F in Z_p for small p (that is, by the polynomial factorisation of F over Z_p). Therefore, 
the choice of d will influence root properties as well. Quintic polynomials, for example, 
can have more roots in Z_p for p > 3 than quadratic polynomials, but (on average) do 
they? 

A thorough examination of this topic is not within the scope of this thesis, and in 
any event becomes less relevant in light of the procedures outlined in Chapter 5. In 
this subsection we intend only touching on some relevant and accessible considerations. 

Eventually in this subsection we consider polynomials with d = 4, 5 and 6, as 
in Section 3.1. We use a simple model to estimate the average distribution of 
non-projective q_p for base-m polynomials of these degrees. 

First though, we focus on d = 2 (that is, the polynomials produced by Montgomery's 
Two-Quadratics method, Section 2.3.1). The polynomials F(x,y) are now 
binary quadratic forms. The rich theory of binary quadratic forms provides results 
from which we prove that, on average, the odds are stacked against F having good 
root properties. This highlights the importance of having regard to root properties in 
polynomial selection. 


Quadratic Polynomials 


A binary form is said to represent some r ∈ Z if there exist x, y ∈ Z for which 
F(x,y) = r. We are interested in the case gcd(x,y) = 1. 


Definition 3.2.2 A binary form F primitively represents some r ∈ Z if there exist 
coprime integers x and y for which F(x,y) = r. 


The following theorem is a standard result from the theory of quadratic forms 
(see for example [13]). It gives necessary conditions on the primitive representation of 
integers by binary quadratic forms over Z. 


Theorem 3.2.3 Let F(x,y) = a_2 x^2 + a_1 xy + a_0 y^2 be a quadratic form over Z and Δ 
its discriminant. Then F primitively represents r ∈ Z only if there exists some s ∈ Z 
for which 

    s^2 \equiv \Delta \bmod 4r.                                            (3.14) 

As an immediate consequence of Theorem 3.2.3 we have 


Corollary 3.2.4 Let F be a binary quadratic form over Z with discriminant Δ, and 
let p be an odd prime not dividing Δ. If F primitively represents some r ∈ Z and p | r 
then 

    \left( \frac{\Delta}{p} \right) = 1. 


In general the converse of Theorem 3.2.3 is not quite true, but there is a slightly 
more general statement that holds. If a solution to (3.14) exists then some class of 
forms of discriminant Δ primitively represents r (again, see [13]). 

Using Corollary 3.2.4 we give a result leading to an estimate of the chance that 
(Δ/p) = 1 for a random assignment mod p of the coefficients (a_2, a_1, a_0) of F. 


Lemma 3.2.5 For each odd prime p coprime to Δ, the number of non-trivial 3-tuples 
(a_2, a_1, a_0) mod p for which (Δ/p) = 1 is 

    \frac{p}{2} (p^2 - 1). 


Proof: Fix a_1 ≢ 0 mod p. For Δ = a_1^2 - 4 a_2 a_0 and (Δ/p) = 1 we have 

    a_2 a_0 \equiv (-4)^{-1} (\chi_p - a_1^2) \bmod p,                     (3.15) 

where χ_p is any of the (p - 1)/2 quadratic residues mod p. Hence the product a_2 a_0 
may take any of (p - 1)/2 values mod p, exactly one of which will force the right hand 
side of (3.15) to be zero because exactly one χ_p = a_1^2. 

For each of the (p - 3)/2 non-zero values of the right hand side of (3.15), there 
are p - 1 ordered pairs (a_2, a_0) whose product gives the right hand side, since for each 
non-zero a_2, a_0 is uniquely determined by a_0 ≡ a_2^{-1} (-4)^{-1} (χ_p - a_1^2) mod p. 

For the single zero value of the right hand side of (3.15), there are 2p - 1 ordered 
pairs (a_2, a_0) for which at least one of a_2 ≡ 0 mod p or a_0 ≡ 0 mod p holds. 

Hence, for non-zero a_1, there are 

    \frac{p-3}{2} (p-1) + 2p - 1 

ordered pairs (a_2, a_0) giving (Δ/p) = 1. There are p - 1 non-zero residue classes for 
a_1, so non-zero a_1 account for 

    (p-1) \left[ \frac{p-3}{2} (p-1) + 2p - 1 \right] = (p-1) \left[ \frac{p^2}{2} + \frac{1}{2} \right] 

tuples (a_2, a_1, a_0) mod p. 
Now, if a_1 ≡ 0 mod p, we require 

    a_2 a_0 \equiv (-4)^{-1} \chi_p \bmod p                                (3.16) 

where again χ_p is any of the (p - 1)/2 quadratic residues mod p. Since the right hand 
side of (3.16) is always non-zero, there are p - 1 pairs (a_2, a_0) for each χ_p, giving a 
total of 

    \frac{(p-1)^2}{2} 

tuples (a_2, a_1, a_0) mod p for a_1 ≡ 0 mod p. 

So, the total number of tuples is 

    (p-1) \left[ \frac{p^2}{2} + \frac{1}{2} \right] + \frac{(p-1)^2}{2} = \frac{p}{2} (p^2 - 1). 

Thus, for odd p, the probability that a uniformly random non-trivial selection 
(a_2, a_1, a_0) mod p satisfies (Δ/p) = 1 is given by 

    \mathrm{Prob}\left[ (\Delta/p) = 1 \right] = \frac{(p/2)(p^2 - 1)}{p^3 - 1} 
                                               = \frac{1}{2} \left( 1 - \frac{1}{p^2 + p + 1} \right) 
                                               < \frac{1}{2}. 

For odd p, it is therefore more likely than not that (Δ/p) ≠ 1 for such a selection, and the 
probability that (Δ/p) = 1 is smallest for smaller p. This highlights the significance 
of selecting polynomials which do have roots modulo small p. 
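
The count (p/2)(p^2 - 1) obtained by adding the cases in the proof can be confirmed by 
exhaustive enumeration for small odd p. The sketch below assumes Δ = a_1^2 - 4 a_2 a_0 and 
evaluates the Legendre symbol by Euler's criterion; the function names are introduced 
here for illustration only. 

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def count_residue_discriminants(p):
    """Number of non-trivial (a2, a1, a0) mod p with (a1^2 - 4 a2 a0 / p) = 1."""
    count = 0
    for a2 in range(p):
        for a1 in range(p):
            for a0 in range(p):
                if a2 == 0 and a1 == 0 and a0 == 0:
                    continue                 # exclude the trivial tuple
                if legendre(a1 * a1 - 4 * a2 * a0, p) == 1:
                    count += 1
    return count


def probability(p):
    """Proportion of non-trivial tuples mod p with (Delta/p) = 1."""
    return count_residue_discriminants(p) / (p ** 3 - 1)


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        print(p, count_residue_discriminants(p), p * (p * p - 1) // 2, probability(p))
```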


Higher Degree Polynomials 


We turn now to polynomials of higher degree. Recall from Section 3.1 that base-m 
polynomials of degree 4, 5 and 6 are the most relevant for integers in the current range 
of interest. Unfortunately, when passing from d = 2 to higher degree, we lose the rigour 
of available results on quadratic forms. Instead, we now obtain information about the 
factorisation of the single variable polynomial f(x) mod p as a function of d (assuming 
p > d), from a result concerning the Galois group of a random polynomial f ∈ Z[x]. 

Informally, the result is that most monic polynomials f ∈ Z[x] of degree d have 
Galois group isomorphic to the symmetric group on d elements, S_d. That being the 
case, the typical factorisation of f can be deduced by examining the space of possible 
cycle decompositions in S_d. 

Formally, the result is as follows. Let f(x) ∈ Z[x] of degree d be monic. The Galois 
group of f, G(f), may be considered as a subgroup of S_d. The question is this: how 
many f have G(f) < S_d? 


Theorem 3.2.6 Let E_d(A) be the number of monic polynomials f(x) ∈ Z[x] with 
max(|a_0|, ..., |a_{d-1}|) < A for which G(f) < S_d. Then 

    E_d(A) \ll A^{d - 1/2} \log^{1 - \epsilon} A 

where ε = ε(d) > 0. 


For a proof see [34]. For more discussion on this and related results also see [19] 
and [24]. We typically have d = 5 and A = 10% with, notionally, 0 < a; < A. The 
total number of possible monic polynomials is A”. Theorem 3.2.6 gives 

Ea(A) 
A5 


Hence, we may safely assume that most monic polynomials we encounter have 


53-10, 


However, we search only amongst non-monic polynomials. The roots mod p of F 
that arise from a_d > 1 are precisely the projective roots. As we see in Chapters 5 
and 6, projective roots are exploited to equip each F with better than average root 
properties. The parameter α(F) incorporates this effect. For now, we use Theorem 
3.2.6 as a guide only to the underlying non-projective root structure for each F. 

We now examine this structure, on the assumption that G(f) = S_d. A permutation 
σ ∈ S_d which is a product of k cycles of length l_1, ..., l_k (of course l_1 + ... + l_k = d) 
corresponds to a factorisation of f into k irreducible factors of respective degrees l_i for 
i = 1, ..., k. This assumes that each cycle is represented to its maximal length; that 
is, we do not for example break a 3-cycle into two transpositions. We refer to each 
possible set of l_i with l_1 + ... + l_k = d as a distinct cycle structure in S_d. The exercise now 
is to count the occurrences of each cycle structure for d = 4, 5, 6. 

Table 3.4 shows the number of ways each possible cycle structure appears in S_4, 
S_5 and S_6. 

Each appearance of a cycle of length one (l_i = 1) in a given cycle structure 
corresponds to a distinct root mod p of f. Hence, Table 3.5 collects for each d the structures 
that give 0, ..., d roots of f. The frequency column records the frequency with which 
structures giving q_p roots mod p occur as a fraction of the d! possibilities. 

Notice that q_p > 1 on average 29% of the time when d = 4, 26% of the time when 
d = 5 and 27% of the time when d = 6. This indicates that the average set of 
non-projective root properties is best for d = 4 and worst for d = 5, although the difference 
between them is not great. 
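
The counts behind these percentages can be reproduced by enumerating S_d directly. The 
following sketch lists every permutation of d points, records its cycle structure, and 
tallies the number of 1-cycles (fixed points), which under the assumption G(f) = S_d 
plays the role of q_p. The function names are ours. 

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def cycle_type(perm):
    """Sorted tuple of cycle lengths of a permutation given as a tuple of images."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if not seen[start]:
            n, j = 0, start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                n += 1
            lengths.append(n)
    return tuple(sorted(lengths))


def cycle_structure_counts(d):
    """Number of permutations in S_d with each cycle structure."""
    return Counter(cycle_type(perm) for perm in permutations(range(d)))


def root_count_distribution(d):
    """Number of permutations in S_d with exactly q fixed points, for each q."""
    dist = Counter()
    for lengths, n in cycle_structure_counts(d).items():
        dist[lengths.count(1)] += n
    return dist


def fraction_with_more_than_one_root(d):
    """Fraction of S_d with more than one fixed point."""
    dist = root_count_distribution(d)
    return Fraction(sum(n for q, n in dist.items() if q > 1), sum(dist.values()))


if __name__ == "__main__":
    for d in (4, 5, 6):
        frac = fraction_with_more_than_one_root(d)
        print(d, dict(sorted(root_count_distribution(d).items())), frac, float(frac))
```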

In any event, the procedure we present in Chapter 5 isolates polynomials with 
exceptionally good root properties, not polynomials with average root properties. Whilst 
experimenting with different degrees, we observed that the best values of α(F) we found 
do not vary much across d = 4, 5, 6. Hence we conclude that in practice, the choice of 
degree should be determined mainly by size considerations, not by root properties. 


3.3 Summary 


In this chapter we have identified and described the properties which influence yield. 

The influence of size on yield is, for the most part, well known. However, the 
problem of choosing the degrees of (F_1, F_2) is unique to the number field sieve. Here we 
have verified that d = 4, 5, 6 are the relevant degrees for non-monic base-m polynomials 
and N in the range of interest. For most of this range, d = 5 is the best degree for F_1. 

To assess the influence of root properties on yield it is necessary to derive a 
parameter which quantifies their effect. We use a “typical F-value” model, the crux of which 
is to estimate accurately the quantity cont_p(v) for certain values v. After estimating 
cont_p(v) we have constructed a parameter α(F) which measures root properties in the 
following sense: due to root properties, F-values F(x, y) behave as if they are random 
values of size F(x, y)·e^{α(F)}. The idea now is to seek polynomials with α(F) < 0. 

We have also considered the choice of d as an influence on root properties. We find 
that, on average and over the non-projective roots, d = 4 is better than d = 6 and 
d = 5, in that order. However, the difference is not so great that these considerations 
should enter into the choice of d. 

It is instructive at this point to hint at the benefit obtained from understanding 
root properties. Using the procedures of Chapter 5, it is not uncommon to find 
non-monic quintic F_1 (with common root, say, m_1) for N with α(F_1) ≈ -7. Indeed, that 
is the case with the polynomial used for the RSA-140 factorisation. How much benefit 
does this return? 

Since e^7 ≈ 1000, values of such F_1 behave as if they are 1/1000 of their actual value. 
Suppose we attempted to reap the same reward, naively, by shaving a factor of 1000 
from each coefficient of F_1. Then we would have a polynomial whose coefficients are 
of the size expected for a random choice of polynomial with m_2 ≈ m_1/1000. Since 
m ≈ N^{1/(d+1)}, the new polynomial has coefficients of the size expected from a random 
choice for 

    N_2 \approx m_2^{d+1} \approx \frac{N_1}{1000^{d+1}}. 

That is, N_2 ≈ 10^{-18} N_1. The benefit from root properties alone, once quantified, is 
that the polynomials we find have yields expected from a random choice of polynomial 
for an integer 18 digits smaller than the integer we are trying to factorise. 
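
The arithmetic of this comparison is short enough to state as code. The sketch below 
assumes, as in the argument above, that scaling every coefficient by e^α scales m by the 
same factor, so that N ≈ m^(d+1) is scaled by e^((d+1)α); the function name digits_saved 
is ours. With the rounded factor 1000 and d = 5 the reduction is 1000^6 = 10^18. 

```python
import math


def digits_saved(alpha, d):
    """Decimal digits by which N shrinks when m is scaled by e^alpha and N ~ m^(d+1)."""
    return -(d + 1) * alpha / math.log(10)


if __name__ == "__main__":
    print(digits_saved(-7.0, 5))               # using e^7 itself
    print(digits_saved(-math.log(1000), 5))    # using the rounded factor 1000
```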

This is the influence of root properties alone. How does size interact with root 
properties? Once we have a means of quantifying root properties, we are drawn to 
questions of their interaction with size, and the influence of both properties on yield. 
These are typical of the questions we consider in the next chapter. 


Table 3.4: Cycle structure counts in S_4, S_5 and S_6 (columns: {l_i}, Occurrences) 


Table 3.5: Relative frequencies of q_p as a function of d. 


Chapter 4 


Modelling Yield 


This chapter is a computational study on polynomial yield. The aims are to ensure 
that our understanding of the properties which influence yield is correct, to extract 
some information on the benefit obtained from manipulating these properties, and to 
present a simple method for estimating yield. 

For the most part we study only simple cases in this chapter. In particular, we 
consider polynomials selected by Montgomery’s Two Quadratics method (see Section 
2.3.1), which have been adjusted for line sieving across F(z,1) = f(x). Line sieving 
over quadratic polynomials is certainly of interest in its own right (see [33] and [32]), 
and has the added advantages of being easier to visualise and tidier to analyse. Hence, 
unless stated otherwise, we assume in this chapter that F(x,y) is of degree two and 
has been chosen for sieving across f(x). 

Recall from the previous chapter that the parameter α(f) is constructed to quantify 
the effect on the typical f-value of root properties. Indeed, we take the value f(x) to 
behave like a random integer of size f(x)·e^{α(f)}. In Section 4.1 we ask “how much 
benefit can be obtained from exploiting root properties?”. That is, we examine yield 
as a function of α, with α in an achievable range. 

At this stage we use an established method of calculating yield. We adapt the 
method used by Boender in [6] to calculate yield of MPQS polynomials. Boender 
confirms in [6] that his method gives a reasonable approximation to the yield of such 
polynomials. We also adapt Boender’s method to consider 1LP- and 2LP- yields as a 
function of α. Finally, we conduct sieving experiments to confirm the predictions of 
this section. 

In Section 4.2 we use the fact that α correctly quantifies the effect of root properties 
to suggest a simple method of approximating yield. We test the estimate on several 
polynomials from [33]. We compute peak yields of these polynomials, and yield across 
the sieve region, then compare predicted yields with actual yields found by sieving. 
We then use our simpler model to examine yield due to root properties under 
conditions that we encounter whilst considering larger N and higher degree polynomials in 
subsequent chapters. 

Section 4.3 contains a summary of this chapter. 


4.1 Yield as a Function of Root Properties 


We turn now to the influence of root properties on yield. Below we adapt Boender's 
method of calculating yield to our polynomials, and then compute variations in full, 
1LP- and 2LP- yields as a function of α. That is, we ask “all other things being equal, 
what is the influence of root properties on yield?”. 


4.1.1 In Theory: Boender’s Yield Estimation 


Boender’s approach is to use analytic estimates of the number of smooth integers in 
short intervals. Taking f to be a continuous curve on R, these estimates are computed 
on intervals in R sufficiently small to approximate the likelihood of a given point in 
that interval being an integer point on f. Estimates are then summed over many 
intervals. 


Smooth Integers in an Interval 


We require an estimate of the number of smooth integers in an interval of a given size. 
For an integer n recall that P_1(n) denotes the largest prime factor of n and that 

    \psi(x, y) = |\{ n \in \mathbb{Z}^+ : n \le x \text{ and } P_1(n) \le y \}|. 

Then asymptotically 

    \psi(x, y) \approx x \left( \rho(u) + (1 - \gamma) \frac{\rho(u-1)}{\log x} \right)      (4.1) 

where u = (log x)/log y, γ is Euler’s constant and ρ(u) is the Dickman function (see 
Section 2.2.1). 
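
For orientation, (4.1) can be evaluated numerically once the Dickman function is 
available. The sketch below is one simple way to do so and is not Boender's method: it 
tabulates ρ on a uniform grid using the identity u ρ(u) = ∫_{u-1}^{u} ρ(t) dt (valid for 
u ≥ 1, with ρ = 1 on [0, 1]) and the trapezoidal rule. The grid size n and all function 
names are assumptions of this example, and psi_estimate assumes u ≥ 1. 

```python
import math

EULER_GAMMA = 0.5772156649015329


def dickman_table(max_index, n):
    """Approximate rho(i / n) for i = 0, ..., max_index on a grid of step 1/n."""
    rho = [1.0] * (max_index + 1)            # rho(u) = 1 for 0 <= u <= 1
    h = 1.0 / n
    for i in range(n + 1, max_index + 1):
        interior = sum(rho[i - n + 1:i])     # trapezoid nodes strictly inside [u-1, u]
        # Solve u*rho(u) = h*(rho(u-1)/2 + interior + rho(u)/2) for rho(u).
        rho[i] = h * (0.5 * rho[i - n] + interior) / (i * h - 0.5 * h)
    return rho


def dickman_rho(u, n=1000):
    """Approximate Dickman rho at u >= 0, rounded to the nearest grid point."""
    if u <= 1.0:
        return 1.0
    idx = round(u * n)
    return dickman_table(idx, n)[idx]


def psi_estimate(x, y, n=1000):
    """Right-hand side of (4.1): x * (rho(u) + (1 - gamma) * rho(u - 1) / log x)."""
    u = math.log(x) / math.log(y)
    return x * (dickman_rho(u, n) + (1 - EULER_GAMMA) * dickman_rho(u - 1, n) / math.log(x))


if __name__ == "__main__":
    print(dickman_rho(2.0), 1 - math.log(2))
    print(psi_estimate(10 ** 12, 10 ** 4))
```

The check against ρ(2) = 1 - log 2 is exact in theory, which makes it a convenient test of 
the grid size. 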
Now, for fixed e € (0, 1), the number of y-smooth integers in the interval [x, 7+2/z] 


is given by 
x 
log(1 + y/ log z) 
zlogy 


for x,y,z in the range x > 2 and 


1 loglog(1 +y) 
v(x, y) l + O E + KT ) 


(log log z)?*** < logy < (log2)?/P, 1 < z < R(z, y), 


where R(z,y) is an expression depending on x, y and some fixed constants (see [6]). 


Combining (4.1) and (4.2) and approximating some of the logarithms gives 
z x 
uć fy («+ 2 y) m ble, y)) = 
x z 


log l 1 log l 
(1 PI "p Hay) (1 gelato PEL æv) | (4.3) 
z logy 
where σ(x,y) is given by 

    \sigma(x, y) = \rho(u) + (1 - \gamma) \frac{\rho(u-1)}{\log x} 

and the c_i(ε) are constants depending on ε [6]. Boender notes that the range of interest 
for x, y, z slightly extends that for which (4.2) is proven to hold. Empirically, however, 
(4.2) still provides a good approximation in the range of interest. 


Estimating Yield 


Suppose we are to sieve for B-smooth values of |f(x)| with x in the range $[a_1, a_2]$. We
use the approximations at (4.3) to estimate the yield of f across $[a_1, a_2]$.

Care is required in the calculations below when f (considered as a continuous curve 
on the real interval [a1,a2]) contains a stationary point or real roots in [a1, a2]. In our 
circumstances this is always the case. Clearly each curve f can be cut into segments 
which exclude roots and turning points (we require at most four segments). We call 
the segment so obtained which occupies the largest portion of $[a_1, a_2]$ the principal
segment. For each curve below we have repeated our calculations on every segment 
of the curve, and obtained almost identical results on each segment. Hence we report 
only the results on the principal segment. 

Let I be the real x-interval defining the principal segment, and let Γ be the
continuous curve defined by f on I. Since Γ contains no turning point in I, we can assume
either f'(x) < 0 or f'(x) > 0 for all x ∈ I. We assume the latter; the former only
requires sign changes in the arguments below. Similarly, we assume f(x) > 0 for all
x ∈ I. The question now is 'how many integer points on Γ are B-smooth?'.

We approximate the number of B-smooth integer values on Γ by cutting Γ into
shorter intervals and summing the yield over these intervals. Let $S_1$ and $S_2$ be the
minimum and maximum values respectively, taken by Γ on I. Cut $[S_1, S_2]$ into K
subintervals $[y_i, y_{i+1}]$ for i = 0, ..., K - 1 by taking
\[ h = \frac{\log S_2 - \log S_1}{K} \]
so $y_i = S_1 e^{ih}$. In accordance with our notation for estimating the number of smooth
integers in an interval, we write $y_{i+1} = y_i + y_i/z$ where $1/z = e^h - 1$.
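The subdivision just described is geometric in the f-values, which the following short sketch makes concrete. The helper name `log_spaced_cuts` and the sample values of S_1, S_2 and K are assumptions chosen only for illustration.

```python
import math


def log_spaced_cuts(s1, s2, k):
    """Return h, z and the cut points y_0..y_K with y_i = s1 * exp(i * h)."""
    h = (math.log(s2) - math.log(s1)) / k
    z = 1.0 / (math.exp(h) - 1.0)          # chosen so that y_{i+1} = y_i + y_i / z
    ys = [s1 * math.exp(i * h) for i in range(k + 1)]
    return h, z, ys


# Hypothetical range of f-values and number of subintervals.
h, z, ys = log_spaced_cuts(1e18, 1e22, 100)
print(h, z, ys[0], ys[-1])
```

Each subinterval has the same relative width 1/z, which is what allows the short-interval estimate to be applied with a single value of z throughout.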

Now, for each $y_i$, let $x_i \in \mathbb{R}$ be such that $(x_i, y_i) \in \Gamma$. Let
\[ s_i = \frac{y_{i+1} - y_i}{x_{i+1} - x_i} \]
denote the slope of Γ on $[x_i, x_{i+1}]$, and let $t(y_i)$ denote the number of B-smooth
f-values on Γ with $y < y_i$. Clearly the yield on the whole of Γ, $X_f$, is given by
\[ X_f = \sum_{i=0}^{K-1} \bigl( t(y_{i+1}) - t(y_i) \bigr). \tag{4.4} \]

For $y \in [y_i, y_{i+1}]$ the probability that a randomly chosen $(x, y) \in \Gamma$ has $x \in \mathbb{Z}$ is
approximately $1/s_i$. So we have
\[ t(y_{i+1}) - t(y_i) \approx \frac{y_{i+1} - y_i}{s_i}\,P(f_i, B) = (x_{i+1} - x_i)\,P(f_i, B) \]
where $P(f_i, B)$ is the probability that an integer f-value in $[y_i, y_{i+1}]$ is B-smooth.
Recall that we consider f-values f(x) as likely to be B-smooth as random integers
of logarithm log(f(x)) + α(f) where
\[ \alpha(f) = \sum_{p \le B} \left( \frac{1}{p - 1} - \mathrm{cont}_p(v) \right) \log p. \]
So, if $g_i(\alpha) = \log y_i + \alpha = \log S_1 + ih + \alpha$, and if $v_i(\alpha) = g_i(\alpha)/\log B$, approximation
(4.3) yields
\[ t(y_{i+1}) - t(y_i) \approx (x_{i+1} - x_i)
   \left( 1 - \frac{\log g_i(\alpha)}{\log B} \right)
   \left( \rho(v_i(\alpha)) + (1 - \gamma)\,\frac{\rho(v_i(\alpha) - 1)}{g_i(\alpha)} \right)
   \left( 1 + \frac{c_1}{z} + \frac{c_2 \log\log B}{\log B} \right). \tag{4.5} \]
Approximation (4.5) and equation (4.4) give an approximation to $X_f$.


4.1.2 Full Yield as a Function of α


We now consider the full yield X; as a function of a. For B fixed, a( f) is bounded. 
In fact for B = 5- 10° with quadratic f we have approximately |a| < 14.16. However 
for the quadratic polynomials investigated, typically α ∈ [-3, 1], a range of 4. So we
consider α ∈ [-4, 0] and refer to this as the practical range for α. Note that when
we consider higher degree polynomials for larger N we encounter much more extreme
values of α.

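We approximate $X_f$, with appropriate parameter choices, as α varies in the practical
range, all other things being equal. In fact we calculate
\[ Q(\alpha) = \frac{X_f(\alpha)}{X_f(0)} \approx
   \frac{\sum_{i=0}^{K-1} (x_{i+1} - x_i) \left( 1 - \frac{\log g_i(\alpha)}{\log B} \right)
         \left( \rho(v_i(\alpha)) + (1 - \gamma)\,\frac{\rho(v_i(\alpha) - 1)}{g_i(\alpha)} \right)}
        {\sum_{i=0}^{K-1} (x_{i+1} - x_i) \left( 1 - \frac{\log g_i(0)}{\log B} \right)
         \left( \rho(v_i(0)) + (1 - \gamma)\,\frac{\rho(v_i(0) - 1)}{g_i(0)} \right)}. \]
The quantity Q(α) approximates the relative increase in full yield we might expect as
α decreases in the practical range.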
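The ratio Q(α) can be evaluated numerically along the lines of (4.4) and (4.5). The sketch below is a toy version under stated assumptions: the quadratic, the segment [10^6, 10^8], the bound B = 10^5 and all function names are hypothetical, the Dickman function is tabulated with a simple trapezoid scheme, and the factor (1 + c_1/z + c_2 log log B / log B) is omitted because it is the same in every term and cancels in the ratio.

```python
import math

EULER_GAMMA = 0.5772156649015329


def dickman_table(u_max, n=1000):
    """Trapezoid tabulation of rho on a grid of spacing 1/n."""
    size = int(u_max * n) + 1
    values, prefix = [1.0] * size, [0.0] * (size + 1)
    for k in range(size):
        if k > n:
            values[k] = (0.5 * values[k - n] + prefix[k] - prefix[k - n + 1]) / (k - 0.5)
        prefix[k + 1] = prefix[k] + values[k]
    return values


def rho(u, table, n=1000):
    """Linear interpolation, zero for negative arguments, clamped at the table end."""
    if u < 0:
        return 0.0
    pos = min(u * n, len(table) - 1.0)
    k = min(int(pos), len(table) - 2)
    return table[k] + (pos - k) * (table[k + 1] - table[k])


def invert_increasing(f, target, lo, hi):
    """Solve f(x) = target by bisection, assuming f increases on [lo, hi]."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def full_yield(f, x_lo, x_hi, alpha, bound, k, table):
    """Sum of the interval estimates (4.5), without the common constant factor."""
    s1, s2 = f(x_lo), f(x_hi)
    h = (math.log(s2) - math.log(s1)) / k
    xs = [invert_increasing(f, s1 * math.exp(i * h), x_lo, x_hi) for i in range(k + 1)]
    log_b = math.log(bound)
    total = 0.0
    for i in range(k):
        g = math.log(s1) + i * h + alpha
        v = g / log_b
        density = rho(v, table) + (1 - EULER_GAMMA) * rho(v - 1, table) / g
        total += (xs[i + 1] - xs[i]) * (1 - math.log(g) / log_b) * density
    return total


def q_ratio(alpha, **setup):
    return full_yield(alpha=alpha, **setup) / full_yield(alpha=0.0, **setup)


setup = dict(
    f=lambda x: 5 * x * x + 3 * x + 1,   # hypothetical increasing, positive segment
    x_lo=1e6, x_hi=1e8, bound=1e5, k=100,
    table=dickman_table(10.0),
)
for alpha in (0.0, -1.0, -2.0, -3.0, -4.0):
    print(alpha, q_ratio(alpha, **setup))
```

With these toy parameters the ratio grows as α decreases, which is the qualitative behaviour discussed in this section; the numerical values depend entirely on the assumed inputs.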


Note 4.1.1 In practice Q(α) is approximately independent of K, so we use K = 100
in accordance with [6].

We now insert typical polynomials and other parameters into the calculations. We
use polynomials selected for factorisations of five integers C87, C97, C105, C106 and
C107 in [33]. The polynomials used are the polynomials labelled f(x) in [33] for each
integer. Other relevant parameters from [33] are shown in Table 4.1.


divides | 72% + 1 
B 1.0. 10° 


ke < | 7.5. 102 
fm | ]3.0:107, 
7.5 - 1077] 


Table 4.1: Parameters for Table 4.2 


Values of Q(α) for these parameters approximate the range of relative yields we
can expect due to root properties on typical polynomials in the above cases. Table 4.2
contains Q(α) calculated at several α.


Table 4.2: Q(α) vs α


The complete results on C107 for α ∈ [-4, 0] are shown in Figure 4.1 below. The
complete results for the other parameters are similar.

We see that, heuristically, we expect the difference in yield between polynomials
with values of α at the extremes of the practical range to be as much as a factor of
two. This is a significant difference.


Figure 4.1: Q(α) for C107 (vertical axis: expected increase in full yield)


4.1.3 1LP-Yield as a Function of α


Suppose we now have $B_1$ and $B_2$, with $B_1 < B_2$, and consider the f-values that are
$B_1$-smooth but for the appearance of exactly one prime between $B_1$ and $B_2$. Let $Y_f$ be the
number of these 1LP-smooth f-values on Γ. Again, we approximate $Y_f$ by examining
the 1LP-yield in intervals along Γ. Let $t_1(y_i)$ be the number of 1LP-smooth f-values
on Γ with $y < y_i$. Clearly
\[ Y_f = \sum_{i=0}^{K-1} \bigl( t_1(y_{i+1}) - t_1(y_i) \bigr). \tag{4.6} \]


In what follows we implicitly assume that if $P_1(n)$ is the largest prime factor of
some integer n, then the prime factors of $n/P_1(n)$ are distributed like those of a random
integer of size $n/P_1(n)$. In fact this is not true; see Section 2.2.1 and Aside 4.2.2
following. However, the assumption suffices for our purposes.

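For each large prime p, let $g_{i,p}(\alpha) = g_i(\alpha) - \log p = \log S_1 + ih + \alpha - \log p$, and
$v_{i,p}(\alpha) = g_{i,p}(\alpha)/\log B_1$. Then
\[
\begin{aligned}
t_1(y_{i+1}) - t_1(y_i) &\approx \sum_{B_1 < p \le B_2} \bigl( t(y_{i+1}/p) - t(y_i/p) \bigr) \\
 &\approx (x_{i+1} - x_i) \sum_{B_1 < p \le B_2} \frac{1}{p}
   \left( 1 - \frac{\log g_{i,p}(\alpha)}{\log B_1} \right)
   \left( \rho(v_{i,p}(\alpha)) + (1 - \gamma)\,\frac{\rho(v_{i,p}(\alpha) - 1)}{g_{i,p}(\alpha)} \right) \\
 &\qquad \times \left( 1 + \frac{c_1}{z} + \frac{c_2 \log\log B_1}{\log B_1} \right)
\end{aligned}
\tag{4.7}
\]
which, with (4.6) gives an approximation to $Y_f$.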


We are interested in the relative increase in 1LP-yield as a function of α, that is,
the ratio
\[ \frac{Y_f(\alpha)}{Y_f(0)}. \tag{4.8} \]
Calculating (4.8) directly is time-consuming. Since we are interested only in checking
that practical changes in α can bring significant increases in yield, we instead obtain
upper and lower bounds on (4.8) in intervals along f. The bounds suffice to show a
significant increase in yield. For i = 1, ..., K - 1 let $Y_{f,i}(\alpha) = t_1(y_{i+1}) - t_1(y_i)$ be the
partial yield of f in the i-th interval only. We bound
\[ R_i(\alpha) = \frac{Y_{f,i}(\alpha)}{Y_{f,i}(0)} \]
for i = 1, ..., K - 1.
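Recall that Δ denotes the discriminant of f. Let
\[ LP = \left\{ p : p \text{ prime},\ B_1 < p \le B_2,\ \left( \frac{\Delta}{p} \right) = 1 \right\} \]
be the set of large primes which may appear in the factor base and let $p_1$, $p_2$ be the
minimum and maximum elements (respectively) in LP. Then
\[ g_{i,1}(\alpha) = \log S_1 + ih + \alpha - \log p_2, \quad \text{and} \quad
   g_{i,2}(\alpha) = \log S_1 + ih + \alpha - \log p_1 \]
are the minimum and maximum values (respectively) of $g_{i,p}(\alpha)$ on $(x_i, x_{i+1})$. Also,
\[ v_{i,1}(\alpha) = g_{i,1}(\alpha)/\log B_1, \quad \text{and} \quad
   v_{i,2}(\alpha) = g_{i,2}(\alpha)/\log B_1 \]
are the minimum and maximum values (respectively) of $v_{i,p}$ on $(x_i, x_{i+1})$. Finally, let
\[ L_i(\alpha) = \frac{1}{p_2} \left( 1 - \frac{\log g_{i,2}(\alpha)}{\log B_1} \right)
   \left( \rho(v_{i,2}(\alpha)) + (1 - \gamma)\,\frac{\rho(v_{i,2}(\alpha) - 1)}{g_{i,2}(\alpha)} \right), \]
\[ U_i(\alpha) = \frac{1}{p_1} \left( 1 - \frac{\log g_{i,1}(\alpha)}{\log B_1} \right)
   \left( \rho(v_{i,1}(\alpha)) + (1 - \gamma)\,\frac{\rho(v_{i,1}(\alpha) - 1)}{g_{i,1}(\alpha)} \right). \]
Then
\[ Y_{f,i}(\alpha) \le (x_{i+1} - x_i)\,|LP|\,U_i(\alpha). \]
Similarly, $Y_{f,i}(\alpha) \ge (x_{i+1} - x_i) \cdot |LP| \cdot L_i(\alpha)$. Since we are varying only α,
\[ \frac{L_i(\alpha)}{U_i(0)} \le R_i(\alpha) \le \frac{U_i(\alpha)}{L_i(0)}. \tag{4.9} \]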
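The bounds rest on a monotonicity observation that can be written out explicitly. The derivation below is only a sketch: the auxiliary function φ is notation introduced here for convenience, the definitions of $L_i$ and $U_i$ are taken to be the values of the summand at the two extreme large primes, and the factor $(1 + c_1/z + c_2 \log\log B_1/\log B_1)$ of (4.7) is left out because it does not depend on p or α and cancels in every ratio.

```latex
% Summand of (4.7) as a function of g = g_{i,p}(\alpha):
\[
\varphi(g) = \left( 1 - \frac{\log g}{\log B_1} \right)
             \left( \rho\!\left( \frac{g}{\log B_1} \right)
                    + (1 - \gamma)\,\frac{\rho\!\left( \frac{g}{\log B_1} - 1 \right)}{g} \right).
\]
% For 1 < g < B_1 both factors are positive and non-increasing in g
% (rho is non-increasing), so phi is non-increasing. Since
% g_{i,1}(\alpha) \le g_{i,p}(\alpha) \le g_{i,2}(\alpha) and p_1 \le p \le p_2 for p in LP,
\[
\frac{1}{p_2}\,\varphi\bigl( g_{i,2}(\alpha) \bigr)
  \;\le\; \frac{1}{p}\,\varphi\bigl( g_{i,p}(\alpha) \bigr)
  \;\le\; \frac{1}{p_1}\,\varphi\bigl( g_{i,1}(\alpha) \bigr).
\]
% Summing over the |LP| primes, with L_i = \varphi(g_{i,2})/p_2 and U_i = \varphi(g_{i,1})/p_1:
\[
|LP|\,L_i(\alpha) \;\le\; \sum_{p \in LP} \frac{1}{p}\,\varphi\bigl( g_{i,p}(\alpha) \bigr)
  \;\le\; |LP|\,U_i(\alpha).
\]
% Dividing the bounds at \alpha by the opposite bounds at 0 removes (x_{i+1} - x_i) and |LP|:
\[
\frac{L_i(\alpha)}{U_i(0)} \;\le\; R_i(\alpha) \;\le\; \frac{U_i(\alpha)}{L_i(0)}.
\]
```

The width of the resulting band is governed by how far apart $p_1$ and $p_2$ are, which is consistent with the bounds being cruder when the large prime range is wide relative to the size of the f-values.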


To calculate $R_i(\alpha)$ we use the additional parameters from [33] given in Table 4.3.
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60 66 | 0106 | 0107] 


10-10 | 23-108 | 27. 106 | 27.2. 108 
24.106 | 30-106 | 30.106 | 30-107 


Table 4.3: Large prime bounds 


Each entry is the pair $L_i(-4)$, $U_i(-4)$.

C87        | C97        | C105       | C106       | C107
0.64, 4.42 | 0.61, 4.39 | 1.52, 1.93 | 1.51, 1.93 | 1.54, 1.92
0.69, 4.98 | 0.65, 4.82 | 1.62, 2.05 | 1.62, 2.06 | 1.62, 2.05
0.72, 5.36 | 0.68, 5.18 | 1.72, 2.18 | 1.72, 2.20 | 1.72, 2.18
0.75, 5.74 | 0.71, 5.53 | 1.81, 2.29 | 1.83, 2.32 | 1.81, 2.30
0.78, 5.97 | 0.73, 5.85 | 1.87, 2.40 | 1.88, 2.42 | 1.90, 2.40


Table 4.4: Upper and lower bounds on $R_i(-4)$


We give values of the bounds on $R_i(\alpha)$ evaluated at α = -4, for several i, in Table
4.4.

The values for C87 and C97 are inconclusive; our bounds on the large primes
appearing here are too crude for integers of this size. But the values for C105, C106
and C107 (in particular the lower bounds) are useful. We illustrate in Figure 4.2 the
complete results for C107. The results for C105 and C106 are similar. The region
between the lines represents the expected increase in the 1LP-yield of f.

We conclude that practical changes in α can also bring significant increases in
1LP-yield.


Figure 4.2: $R_i(-4)$ for C107


4.1.4 2LP-Yield as a Function of α


Let $Z_f$ be the number of 2LP-smooth f-values on Γ. Let $t_2(y_i)$ be the number of
2LP-smooth f-values on Γ with $y < y_i$. Then
\[ Z_f = \sum_{i=0}^{K-1} \bigl( t_2(y_{i+1}) - t_2(y_i) \bigr). \tag{4.10} \]
For the large prime pair {p, q} let $g_{i,pq}(\alpha) = g_i(\alpha) - \log p - \log q$ and
$v_{i,pq}(\alpha) = g_{i,pq}(\alpha)/\log B_1$. Then, assuming again (which is not quite true) that the appearance
of p and q in the factorisations of f-values is independent,
\[
\begin{aligned}
t_2(y_{i+1}) - t_2(y_i) &\approx \sum_{\{p,q\} \subseteq LP} \bigl( t(y_{i+1}/pq) - t(y_i/pq) \bigr) \\
 &\approx (x_{i+1} - x_i) \sum_{\{p,q\} \subseteq LP} \frac{1}{pq}
   \left( 1 - \frac{\log g_{i,pq}(\alpha)}{\log B_1} \right)
   \left( \rho(v_{i,pq}(\alpha)) + (1 - \gamma)\,\frac{\rho(v_{i,pq}(\alpha) - 1)}{g_{i,pq}(\alpha)} \right) \\
 &\qquad \times \left( 1 + \frac{c_1}{z} + \frac{c_2 \log\log B_1}{\log B_1} \right).
\end{aligned}
\tag{4.11}
\]
Equation (4.10) and approximation (4.11) give an approximation to $Z_f$.
Again we present bounds on the relative increase in $Z_f$ in intervals along Γ, as α
varies in the practical range. Let $Z_{f,i} = t_2(y_{i+1}) - t_2(y_i)$ be the 2LP-yield of f in the
i-th interval, and let
\[ T_i(\alpha) = \frac{Z_{f,i}(\alpha)}{Z_{f,i}(0)}. \]
We calculate bounds on $T_i$ for i = 1, ..., K - 1 by repeating the calculations of the
previous section. Thus, let $p_1, p_2$ and $p_3, p_4$ be the two least and two greatest elements
(respectively) of LP. Let
\[
\begin{aligned}
g_{i,1}(\alpha) &= g_i(\alpha) - \log p_3 - \log p_4, \\
g_{i,2}(\alpha) &= g_i(\alpha) - \log p_1 - \log p_2, \\
v_{i,1}(\alpha) &= g_{i,1}(\alpha)/\log B_1, \text{ and} \\
v_{i,2}(\alpha) &= g_{i,2}(\alpha)/\log B_1.
\end{aligned}
\]
Then if
\[ L_i(\alpha) = \frac{1}{p_3 p_4} \left( 1 - \frac{\log g_{i,2}(\alpha)}{\log B_1} \right)
   \left( \rho(v_{i,2}(\alpha)) + (1 - \gamma)\,\frac{\rho(v_{i,2}(\alpha) - 1)}{g_{i,2}(\alpha)} \right), \text{ and} \]
\[ U_i(\alpha) = \frac{1}{p_1 p_2} \left( 1 - \frac{\log g_{i,1}(\alpha)}{\log B_1} \right)
   \left( \rho(v_{i,1}(\alpha)) + (1 - \gamma)\,\frac{\rho(v_{i,1}(\alpha) - 1)}{g_{i,1}(\alpha)} \right) \]
we have
\[ \frac{L_i(\alpha)}{U_i(0)} \le T_i(\alpha) \le \frac{U_i(\alpha)}{L_i(0)}. \tag{4.12} \]
Table 4.5 contains values of the bounds on $T_i(-4)$ given by (4.12), for several i.


Each entry is the pair $L_i(-4)$, $U_i(-4)$.

C87         | C97         | C105       | C106       | C107
0.25, 11.60 | 0.23, 11.53 | 1.26, 2.02 | 1.26, 2.02 | 1.33, 2.07
0.25, 12.32 | 0.23, 12.04 | 1.33, 2.12 | 1.33, 2.13 | 1.40, 2.19
0.26, 13.13 | 0.24, 13.00 | 1.40, 2.26 | 1.41, 2.27 | 1.45, 2.28
0.26, 13.68 | 0.24, 13.64 | 1.46, 2.36 | 1.47, 2.38 | 1.51, 2.38
0.27, 14.36 | 0.25, 14.24 | 1.51, 2.47 | 1.53, 2.50 | 1.56, 2.46


Table 4.5: Upper and lower bounds on $T_i(-4)$


The results for C87 and C97 are again inconclusive, whilst those for C105, C106 and
C107 are useful. We conclude again that practical changes in α can bring significant
increases in the 2LP-yield.


4.1.5 In Practice: Sieving Experiments 


We now seek empirical verification that the parameter a(f) indeed captures the effect 
of root properties on yield. 

Differences in yield amongst polynomials $f_1$ and $f_2$, due only to root properties,
can be observed by examining the yield across regions where $f_1 \approx f_2$. We chose five
candidate polynomials, Polynomials A, B, ..., E, for the 106 digit integer C106 given
in Table 4.1. The polynomials are given in Appendix A. These particular polynomials 
were chosen because they exhibit a certain range of root properties. We sieved each 
polynomial B,... , E in intervals of size 10° centred on a point at which the polynomials 
take the same value as Polynomial A. Over the entire interval the “other” polynomial 
has the same size as Polynomial A to at least the fourth significant figure, and usually 
more. Any difference in yield between the polynomials over these intervals should
therefore be due to their different root properties.
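To make the three kinds of yield concrete, the sketch below classifies polynomial values as full, 1LP or 2LP relations by complete trial division. It is a toy illustration only: the polynomial, the tiny bounds and the function names are hypothetical, and trial division stands in for the line siever actually used in such experiments.

```python
from collections import Counter


def prime_factors(n):
    """Prime factors of n >= 1 with multiplicity, by trial division (toy sizes only)."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def classify(value, b1, b2):
    """Return 'full', '1LP', '2LP' or None for a single polynomial value."""
    value = abs(value)
    if value == 0:
        return None
    factors = prime_factors(value)
    if factors and max(factors) > b2:
        return None                                   # not even b2-smooth
    large = sum(1 for p in factors if p > b1)         # primes in (b1, b2]
    return {0: "full", 1: "1LP", 2: "2LP"}.get(large)


def yields(f, xs, b1, b2):
    """Count relations of each kind among the values f(x) for x in xs."""
    kinds = (classify(f(x), b1, b2) for x in xs)
    return Counter(kind for kind in kinds if kind)


f = lambda x: 3 * x * x + 5 * x - 7      # hypothetical toy polynomial
print(yields(f, range(1000, 2000), b1=50, b2=500))
```

Running `yields` for two polynomials over intervals on which their values have nearly equal size gives the kind of comparison described in this section, on a much smaller scale.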

These polynomials are typical of polynomials produced by Montgomery's Two
Quadratics method for line sieving on C106, except that their root properties are
fixed. In fact, Polynomials A, ..., E have α ∈ [-2.56, 1.51] as shown in Table 4.6. We
used $B_1 = 2700000$ and $B_2 = 30000000$ in accordance with [33].


Table 4.6: a values for candidate polynomials. 


We summarize the results in Table 4.7 below. The relative yields shown are the 
yield of Polynomial A relative to the “other” polynomial, so for example the full yield 
of Polynomial A is 2.32 times that of Polynomial E. 


Polynomial f | α(f) - α(A) | rel. total yield | rel. full yield | rel. 1LP yield | rel. 2LP yield


Table 4.7: Relative yields due to root properties 


According to the calculations of Section 4.1.2 the increases in full yield of A should 
be approximately 1.24, 1.51, 1.86, 2.30 relative to polynomials B,... , E respectively. 
Moreover, the increases in 1LP and 2LP yields of Polynomial A relative to Polynomial 
E fall close to the middle of the bounds of Sections 4.1.3 and 4.1.4. 


The values taken by Polynomials C and D behave more like random integers than
we expect on the basis of Section 4.1.2. Probably this is because in Section 4.1.2 we
consider only changes in α, not the value itself. The values α(C) and α(D) are close
to zero (-0.50 and 0.50 respectively). Hence we must expect their values to behave
more like random integers than if their α values were -2 and -1 for example.

We conclude that in the quadratic cases examined, differences in yield from root
properties alone can indeed be as much as a factor of two. Root properties are therefore
a factor which should be considered whilst modelling yield.


4.2 Modelling Yield


Having established that α(f) seems to quantify well the effect of root properties, we
now seek a simple model of yield. The model given here is used in subsequent chapters 
dealing with the problem of finding good polynomials. 

In Section 4.2.1 below we visualise the relevant features of yield. From sieving 
experiments we see increased yields at real roots of f, and the relative variation in 
yield away from real roots. We refer to the former as peak yield and the latter as 
yield across the region. Notice that whilst the peak yields we see in this section all 
correspond to real roots, in general the peak yield of a polynomial occurs where it takes 
minimal absolute value. We propose a simple method of estimating yield in Section 
4.2.2, and use it in Section 4.2.3 to predict both peak yield and yield across the region. 
To this point we have considered only quadratic polynomials with line sieving, but in
Section 4.2.4 we extend our simple model to repeat some calculations of Section 4.1
under conditions experienced in factorisations of large RSA keys.


4.2.1 Actual Yield 


On each of the polynomials A, ..., E we performed line sieving in short intervals along
$|x| < 10^{15}$, again with smoothness bounds $B_1 = 2700000$ and $B_2 = 30000000$. We
sieved in intervals of length $10^8$ centred at steps of $10^{14}$ along the sieve interval, and
in intervals of $10^8$ centred at each real root of each polynomial.

For all polynomials the obvious feature of yield across the sieve region is the relative
increase at real roots. This of course is due to the polynomials taking much smaller
values close to roots. Common to all polynomials under the conditions we investigated
is an increase in total yield by a factor of at least fifteen across roots. Polynomial A is
typical; Figure 4.3 shows the relative increase at real roots of Polynomial A.

During an entire sieve run, values of x close to real roots of f(x) are a richer supply
of smooth f-values than those not. Of course this does not necessarily mean that we
should blindly search for polynomials with as many real roots as possible. We see
particularly in subsequent chapters that, leaving aside the question of root properties,
the pervading requirement is that f-values be kept small over the sieve region. Real
roots will help of course, but are not the sole determining factor.

Most values of x in the sieve region are not close to real roots of f. The total yield
away from real roots is not quite as flat as Figure 4.3 indicates. Figure 4.4 shows total
yield across $|x| < 10^{15}$ just in steps of $10^{14}$ (that is, without explicitly showing the
yield at real roots).


Figure 4.3: Total yield (with roots) of Polynomial A with $|x| < 10^{15}$


Remark 4.2.1 Figure 4.4 suggests that, in relative terms, the yield of f varies greatly
across the region. This has consequences for the collection of relations. Recall that
a relation for the number field sieve in its full generality is a coprime integer pair
(a, b) at which both $F_1(a, b)$ and $F_2(a, b)$ are smooth (or almost smooth). In this
case, since we use line sieving, y = 1. So far we have considered only the yield of
$f_1$ and $f_2$ individually. It is reasonable to assume that these yields are independent.
That being the case, the likelihood of a given x = a causing both $f_1(a)$ and $f_2(a)$ to
be simultaneously smooth will increase if the regions of maximal yield of $f_1$ and $f_2$
coincide. The same argument holds for sieving $F_1(x, y)$ and $F_2(x, y)$. Particularly if
one is using more than two non-linear polynomials (Remark 2.3.1 and [32]) current
performance might be exceeded by considering the proximity of the real roots of $F_1$
and $F_2$ when selecting polynomials.


Recall that in dealing with large prime yields we are assuming that the appearance
of each prime in the factorisation of a given integer is independent. This of course
is not true, and next we observe the effect of the dependence. Since this is of little
practical consequence we leave it as an aside.


Figure 4.4: Total yield (without roots) of Polynomial A with $|x| < 10^{15}$


Aside 4.2.2 As before, let T be the total yield, and Q, R, S be the full, 1LP and 2LP
yields respectively. For all five polynomials the proportions Q/T and R/T increase
close to real roots at the expense of S/T. For example, for Polynomial A the proportion
Q/T increases from 10% to 18%, R/T increases from 38% to 44% and S/T decreases
from 52% to 38%. For the other polynomials the proportions take similar values.

Recall from Section 2.2.1 the generalisations $\rho_2(u)$ and $\rho_3(u)$ of ρ(u). These
functions describe the joint distributions for the two and three (respectively) largest prime
factors of r as r → ∞. We are interested in the special cases in which r has exactly
one or exactly two prime factors at most $B_2$, but is otherwise $B_1$-smooth. Let $\rho_2(u, v)$
be the former function and $\rho_3(u, v)$ be the latter function, with $u = \log r/\log B_1$ and
$v = \log r/\log B_2$.

Using the methods of [3] and [45] to calculate these functions, we observe that in 
the range of interest 


OBB ODE: wd 0083 
Ov Ou Ov du ` 


Note that the inequality for pa(u, v) is not true for arbitrary u and v. Intuitively (4.13) 


(4.13) 


means that as r increases, the smoothness probabilities for 2LP-smoothness (and to
a lesser extent 1LP-smoothness) depend more on r being $B_2$-smooth than on the
cofactor (with the large primes removed) being $B_1$-smooth. That is, $B_2$-smoothness is
the "difficult" property. The difference in (4.13) between $\rho_2$ and $\rho_3$ comes from
Op2 _ Ops 
a a 
Intuitively, $\rho_2$ ought to be more sensitive than $\rho_3$ to changes in u because a $B_2$-smooth
integer with only one known prime factor between $B_1$ and $B_2$ is less likely to be
otherwise $B_1$-smooth than one of the same size with two known prime factors between
$B_1$ and $B_2$.

Now, since $B_1 < B_2$,
\[ \frac{du}{dr} > \frac{dv}{dr}. \tag{4.14} \]
Ignoring for the moment the question of root properties, (4.13) and (4.14) imply that
as |f(x)| decreases S/T ought to decrease relative to both Q/T and R/T, and that
R/T ought to decrease slightly relative to Q/T.


4.2.2 Estimating Yield


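Recall that with f(x) > 0 and
\[ u_f(x) = \frac{\log f(x) + \alpha(f)}{\log B} \]
we assume that f(x) is B-smooth with probability approximately
\[ \rho(u_f(x)) + (1 - \gamma)\,\frac{\rho(u_f(x) - 1)}{\log f(x)}. \]
Suppose $I \subset \mathbb{Z}$ is some sieve interval. Then
\[ X_f \approx \sum_{x \in I} \left( \rho(u_f(x)) + (1 - \gamma)\,\frac{\rho(u_f(x) - 1)}{\log f(x)} \right). \tag{4.15} \]
We use the right hand side of (4.15) to approximate the full yield of f across I.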

In practice |I| is large, so (4.15) is too time consuming to compute completely.
Instead we approximate the summation by breaking I into K sub-intervals over which
the right hand side of (4.15) does not change significantly. Let $I_K$ be the interval I so
divided, so $I_K$ contains every |I|/K-th element of I. Hence, if $X_f$ again denotes the
full yield of f across I, then
\[ X_f \approx \frac{|I|}{K} \sum_{x \in I_K} \left( \rho(u_f(x)) + (1 - \gamma)\,\frac{\rho(u_f(x) - 1)}{\log f(x)} \right). \tag{4.16} \]
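The sampled estimate can be compared with the full sum on a small case. The sketch below is illustrative only: the quadratic, the interval, the bound B = 1000, the value α = -1.5 and the function names are assumptions, the Dickman function is tabulated with a simple trapezoid scheme, and the sampled sum is scaled by |I|/K so that it estimates the same quantity as the full sum.

```python
import math

EULER_GAMMA = 0.5772156649015329


def dickman_table(u_max, n=1000):
    """Trapezoid tabulation of rho on a grid of spacing 1/n."""
    size = int(u_max * n) + 1
    values, prefix = [1.0] * size, [0.0] * (size + 1)
    for k in range(size):
        if k > n:
            values[k] = (0.5 * values[k - n] + prefix[k] - prefix[k - n + 1]) / (k - 0.5)
        prefix[k + 1] = prefix[k] + values[k]
    return values


def rho(u, table, n=1000):
    """Linear interpolation, zero for negative arguments, clamped at the table end."""
    if u < 0:
        return 0.0
    pos = min(u * n, len(table) - 1.0)
    k = min(int(pos), len(table) - 2)
    return table[k] + (pos - k) * (table[k + 1] - table[k])


def smooth_probability(value, alpha, bound, table):
    """Summand of the full-yield estimate for one positive f-value."""
    log_value = math.log(value)
    u = (log_value + alpha) / math.log(bound)
    return rho(u, table) + (1 - EULER_GAMMA) * rho(u - 1, table) / log_value


def full_sum(f, interval, alpha, bound, table):
    """Sum over every x in the interval."""
    return sum(smooth_probability(f(x), alpha, bound, table) for x in interval)


def sampled_sum(f, interval, alpha, bound, table, k):
    """Sum over every (|I|/K)-th element, scaled by |I|/K."""
    step = len(interval) // k
    sample = interval[::step][:k]
    return step * sum(smooth_probability(f(x), alpha, bound, table) for x in sample)


table = dickman_table(10.0)
f = lambda x: x * x + 1                          # hypothetical polynomial
interval = range(10**6, 10**6 + 10**5)           # hypothetical sieve interval
exact = full_sum(f, interval, -1.5, 1000, table)
approx = sampled_sum(f, interval, -1.5, 1000, table, k=100)
print(exact, approx)
```

The two numbers agree closely here because the summand changes slowly across each sub-interval, which is the condition stated in the text for the sampling to be adequate.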


4.2.3 Examples 


We now examine estimate (4.16) in the context of both peak yield and yield across the 


region. In the calculations below we use K = 10°. 


Peak Yields 


We tested estimate (4.16) for $X_f$ on seven polynomials with α-values sufficiently low
to be acceptable number field sieve polynomials. In particular, we used Polynomial A,
and six other polynomials F, G, ..., K. Polynomials F, ..., K are polynomials used
to factorise 105, 106 and 107 digit integers in [33]. Details of each are in Appendix A.

We calculated estimate (4.16) in an interval of size 10° across one real root of 
each polynomial, and sieved the polynomial across the same root. Yields across the 
two roots of each polynomial are almost identical so the choice of root is arbitrary. 
We used B = 1600000 for polynomials F and G in accordance with [33], otherwise 
B = 2700000. Table 4.8 contains the results for full relations. 

The estimate places only one polynomial, J, in the incorrect position, and has an
average relative error of 5.9% (most of which is contributed by polynomials J and F).

Polynomial | Est. full yield | Full yield | Relative error (%)


Table 4.8: Estimated vs actual full yield


Yield Across the Sieving Region 


Table 4.8 tests only the peak yields of the polynomials. We saw at Remark 4.2.1 that
it is also of interest to note how estimate (4.16) changes across an entire sieve interval.
In Figure 4.5 below we show estimate (4.16) across the entire $|x| < 10^{15}$ interval, at
uniformly spaced sub-intervals, for Polynomial A. We also show estimate (4.16) at
α = 0, that is, the expected yield if values taken by Polynomial A are as likely to be
smooth as random integers of the same size. This is much lower than the actual yield.

Figure 4.5: Estimated and actual yield of Polynomial A with $|x| < 10^{15}$

We conclude that the approach described in Section 4.2.2 to estimating yield is
useful.


4.2.4 Polynomials for Larger N


We now extend the approach in (4.16) to consider yield due to root properties in the
context of larger N. It is not clear in advance that we should expect differences in
yield similar to those exhibited on quadratic polynomials. Not only are the integers
which are required to be smooth larger, but B also is larger, and the practical range
of α values is different.

The non-linear polynomial considered is now the homogeneous polynomial F(x, y),
so its sieve region lies properly in the x, y-plane. In fact as we see in the next chapter,
the region is usually a rectangle much longer (x direction) than it is wide (y direction).
We denote the length to width ratio s, and in this section will consider a fixed
sub-rectangle S of the entire sieve region which also has length to width ratio s.

Again we let $X_F$ denote the full yield of F over S. This yield depends on α(F),
and in this section the aim is to compute
\[ Q(\alpha) = \frac{X_F(\alpha)}{X_F(0)}. \]
To compute Q(α) we divide S into K equally sized sub-rectangles labelled $S_i$ for
i = 1, ..., K. Let $F_i$ be the mean value of F(x, y) across $S_i$. Putting
\[ u_i(\alpha) = \frac{\log F_i + \alpha(F)}{\log B} \]
we obtain for the probability, depending on α, that $F_i$ is B-smooth
\[ P_\alpha(F_i, B) \approx \rho(u_i(\alpha)) + (1 - \gamma)\,\frac{\rho(u_i(\alpha) - 1)}{\log F_i}. \]
Assume the distribution of coprime integer pairs (x, y) is uniform throughout S. That,
and the fact that all $S_i$ have the same area imply that
\[ Q(\alpha) \approx \frac{\sum_{i=1}^{K} P_\alpha(F_i, B)}{\sum_{i=1}^{K} P_0(F_i, B)}. \tag{4.17} \]
Here we calculate (4.17) using parameters from the factorisation of RSA-140. The
non-linear polynomial used in the factorisation is
\[
\begin{aligned}
F_1(x, y) = {} & 439682082840\,x^5 \\
 & + 390315678538960\,x^4 y \\
 & - 7387325293892994572\,x^3 y^2 \\
 & - 19027153243742988714824\,x^2 y^3 \\
 & - 63441025694464617913930613\,x y^4 \\
 & + 318553917071474350392223507494\,y^5.
\end{aligned}
\]

We examine this polynomial in more detail in the next two chapters. For now all that 
is relevant is that the sieve region that best fits Fı has s ~ 4000. For the calculations 


shown below we used a rectangle S of area 10% with s = 4000, centred on the y-axis 
at y = 5000. This places S in a typical portion of the entire sieve region. We used
K = 10° and B = 274 — 1 (see Section 6.2). 

The practical range for a we take to be [—7,0]. This is much more extreme than 
in the case of the quadratic polynomials of the previous sections. With the degree d as 
high as d = 5 we find polynomials with much better root properties than when d = 2. 
This is due partly to higher degree polynomials having the capacity to have more roots 
for each prime p > d, but mainly to extra tricks we have which rely on d being at least 
four. We explain these tricks in the next chapter. 

For now we use $F_1$ only as a source of F-values typical of those required to be smooth
for factorisations of large N. Figure 4.6 below shows Q(α) computed using (4.17) on
$F_1$ with α in the practical range.
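A calculation of this kind can be sketched as follows, using the RSA-140 polynomial quoted above. Everything else is an assumption made for illustration: the rectangle dimensions (length 8·10^6 and width 2000, so that s = 4000, centred on the y-axis at y = 5000), the grid of 50 × 20 sub-rectangles, the bound B = 2^24 - 1, the use of the value at each sub-rectangle centre in place of the mean value $F_i$, and all function names. Coprimality of (x, y) is ignored.

```python
import math

EULER_GAMMA = 0.5772156649015329

# Coefficients of x^5, x^4 y, ..., y^5 in the RSA-140 polynomial quoted in the text.
COEFFS = [
    439682082840,
    390315678538960,
    -7387325293892994572,
    -19027153243742988714824,
    -63441025694464617913930613,
    318553917071474350392223507494,
]


def dickman_table(u_max, n=1000):
    """Trapezoid tabulation of rho on a grid of spacing 1/n."""
    size = int(u_max * n) + 1
    values, prefix = [1.0] * size, [0.0] * (size + 1)
    for k in range(size):
        if k > n:
            values[k] = (0.5 * values[k - n] + prefix[k] - prefix[k - n + 1]) / (k - 0.5)
        prefix[k + 1] = prefix[k] + values[k]
    return values


def rho(u, table, n=1000):
    """Linear interpolation, zero for negative arguments, clamped at the table end."""
    if u < 0:
        return 0.0
    pos = min(u * n, len(table) - 1.0)
    k = min(int(pos), len(table) - 2)
    return table[k] + (pos - k) * (table[k + 1] - table[k])


def f1(x, y):
    """Exact integer value of the homogeneous polynomial at (x, y)."""
    d = len(COEFFS) - 1
    return sum(c * x ** (d - j) * y ** j for j, c in enumerate(COEFFS))


def grid_centres(x_half, y_mid, y_half, nx, ny):
    """Integer centres of nx * ny equal sub-rectangles of the sieve rectangle."""
    dx, dy = 2 * x_half // nx, 2 * y_half // ny
    return [(-x_half + dx * i + dx // 2, y_mid - y_half + dy * j + dy // 2)
            for i in range(nx) for j in range(ny)]


def smooth_probability(value, alpha, bound, table):
    log_f = math.log(max(abs(value), 2))            # guard against tiny values
    u = max((log_f + alpha) / math.log(bound), 0.0)
    return rho(u, table) + (1 - EULER_GAMMA) * rho(u - 1, table) / log_f


def q_ratio(alpha, values, bound, table):
    """Ratio of summed smoothness probabilities at alpha and at 0."""
    num = sum(smooth_probability(v, alpha, bound, table) for v in values)
    den = sum(smooth_probability(v, 0.0, bound, table) for v in values)
    return num / den


table = dickman_table(12.0)
centres = grid_centres(x_half=4_000_000, y_mid=5000, y_half=1000, nx=50, ny=20)
values = [f1(x, y) for x, y in centres]
bound = 2 ** 24 - 1                                  # assumed placeholder bound
for alpha in range(0, -8, -1):
    print(alpha, q_ratio(float(alpha), values, bound, table))
```

Because every parameter other than the polynomial is assumed, the printed ratios indicate only the shape of the dependence on α and should not be read as the values plotted in Figure 4.6.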


Figure 4.6: Full yield due to root properties at N = RSA-140 (horizontal axis: -α; vertical axis: expected increase in full yield)


4.3 Summary 


In this chapter we have investigated the yield of number field sieve polynomials. We 
take yield to be influenced by two factors, size and root properties. The effect of root 
properties on yield has not previously been well understood. Here we demonstrate that, 
for example, under the conditions of large RSA moduli factorisations root properties 
can influence yield by up to a factor of 4. 

Since it is therefore clear that root properties ought to be taken into account when 
modelling yield, we give a simple estimator of yield which combines both root properties 
and size. We then compare estimated yields to actual yields. Our estimate quantifies 
yield within an accuracy sufficient for our purposes, both absolutely and relatively. 


We are hence led to an understanding of yield: what should be sought is a good
combination of size and root properties. We turn now to the problem of finding
polynomials with such combinations.


Chapter 5


Finding Good Polynomials 


In this chapter and the next we employ our understanding of polynomial yield in 
selecting polynomials for factorisations of large general integers. This chapter contains 
descriptions of the relevant computations, and the next chapter contains the examples 
RSA-130, RSA-140 and RSA-155. 

We consider the selection problem in two stages. In the first stage, we generate a 
large sample of good polynomials. This process is described in Section 5.1. Thousands 
of polynomials survive this stage, so sieving experiments are still impracticable. How- 
ever there remains significant variation in yield across this sample. Thus in the second 
stage we identify, without sieving, the best polynomials in the sample. This process is 
described in Section 5.2. Section 5.3 contains a summary. 

Throughout this chapter we distinguish two types of polynomial, namely non- 
skewed and skewed. The traditional approach to polynomial selection for RSA factori- 
sations is to search for polynomials all of whose coefficients are small, without regard 
to root properties. All coefficients being small endears polynomials to sieving regions 
—U < x < U and 1 < y < U, for some integer U. We refer to these polynomials 
as non-skewed. In this chapter we extend this approach by giving simple methods of 
finding non-skewed polynomials with good combinations of size and root properties. 
Section 5.1.1 describes generation of many good non-skewed polynomials, and Section 
5.2.1 describes identification of the best ones. We demonstrate the strength of even 
these simple methods in the next chapter by repeating the polynomial selection for the 
RSA-130 factorisation. 

The non-skewed case is also a useful introduction to more complicated methods 
of finding good skewed polynomials. In the case of skewed polynomials, we require 
only some of the coefficients to be small. The coefficients $a_d$, $a_{d-1}$ and $a_{d-2}$ will be
particularly small, and usually the coefficients will increase in absolute value from $a_d$
through $a_0$. The natural sieving region for skewed polynomials is a rectangle S whose
length (x-direction) to width ratio is s, with s > 1. We fit a different S to each 
polynomial. In practice we encounter s values up to approximately 10°. Indeed, we 


go to some effort to construct highly skewed polynomials. There are
implementation-specific reasons for seeking such polynomials. More importantly though we are able to
introduce additional techniques to find highly skewed polynomials with excellent root
properties.

We describe the generation of such polynomials in Section 5.1.2. Isolating the best 
skewed polynomials requires only simple adjustments to the procedure for non-skewed 
polynomials, and we discuss this in Section 5.2.2. We demonstrate methods for finding 
skewed polynomials in the next chapter by describing the polynomial selection for the
factorisations of RSA-140 and RSA-155. 


5.1 Generating Good Polynomials 


Recall from Chapter 2 that for integers of the size under consideration, the base-m
method is the best method we have of choosing polynomials. So d is fixed and we seek
$m \approx N^{1/(d+1)}$ with a polynomial f of degree d for which
\[ f(m) \equiv 0 \bmod N. \tag{5.1} \]
Sieving occurs over the polynomials $F_1(x, y) = y^d f(x/y)$ and $F_2(x, y) = x - my$.

As we have seen before, the polynomial f descends from the base-m representation
of N. Let the coefficients of this expansion be $a_i^{(m)}$. That is,
\[ N = \sum_{i=0}^{d} a_i^{(m)} m^i \]
with $0 \le a_i^{(m)} < m$. Heuristically it is sensible to adjust the $a_i^{(m)}$ to lie between -m/2
and m/2. In fact, if $a_i^{(m)} > \lfloor m/2 \rfloor$ then we replace $a_i^{(m)}$ with $a_i^{(m)} - m$ and $a_{i+1}^{(m)}$ with
$a_{i+1}^{(m)} + 1$. Let
\[ f_m(x) = \sum_{i=0}^{d} a_i x^i \]
be the polynomial whose coefficients are the $a_i^{(m)}$ reduced in this way, working from
i = 0, ..., d through the coefficients.
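The base-m expansion and the reduction of its coefficients can be sketched in a few lines. The function names are hypothetical, and the sample inputs (the small integer N of Example 5.1.2 with m = 1035 and d = 3) are chosen only for illustration. One assumption is made explicit in the code: the reduction is applied to $a_0, \ldots, a_{d-1}$ only, since the leading coefficient has no higher coefficient to carry into.

```python
def base_m_digits(n, m, d):
    """Unreduced coefficients [a_0, ..., a_d] with n = sum a_i * m**i.

    The leading coefficient a_d absorbs whatever remains after d divisions.
    """
    digits = []
    for _ in range(d):
        n, r = divmod(n, m)
        digits.append(r)
    digits.append(n)
    return digits


def balance(digits, m):
    """Reduce a_0..a_{d-1} towards [-m/2, m/2], carrying into the next coefficient."""
    a = list(digits)
    for i in range(len(a) - 1):          # assumption: the leading coefficient is left alone
        if a[i] > m // 2:
            a[i] -= m
            a[i + 1] += 1
    return a


N, M, D = 9999399973, 1035, 3
unreduced = base_m_digits(N, M, D)
reduced = balance(unreduced, M)
print(unreduced)   # [13, 566, 19, 9]
print(reduced)     # [13, -469, 20, 9]
```

Both coefficient lists evaluate to N at x = m, so the reduction preserves (5.1) while shrinking the coefficients in absolute value.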

The exercise now is to choose m and $f_m$ (or some variant thereof which preserves
(5.1)) with good combinations of size and root properties. In the case of non-skewed
polynomials, we consider only $f_m$. In the case of skewed polynomials, we have the
freedom to explore many variants of $f_m$.


5.1.1 Non-skewed Polynomials 


Here we give simple methods for choosing good non-skewed $f_m$. Even these simple
methods suffice to give significant improvements over previous factorisation efforts.
We consider first the problem of generating $f_m$ which are "small", then of generating
$f_m$ with better than average root properties.


What does it mean for $f_m$ to be small?


Definition 5.1.1 A base-m representation $f_m$ is χ-small when χ is the largest value
of $|a_i|/m$ for i = 1, ..., d - 1.


We refer to these simply as small base-m representations if the value of χ is not
material.

A necessary condition on a particular representation being small is that the
coefficients $a_d$ and $a_{d-1}$ are small. By the choice of m it is easy to ensure that $a_d$ is small.
Our search simply employs the fact that for small $a_d$, small $a_{d-1}^{(m)}$ occur only when m
is close to a value at which $a_d$ changes.


Example 5.1.2 Let N = 9999399973 = NextPrime($10^5$) × PreviousPrime($10^5$). The
following is the sequence of (unreduced) base-m representations of N around the value
of m which forces $a_3$ to decrease from 9 to 8. The coefficients are listed as [$a_0$, ..., $a_3$].
Notice the (almost) linear change in $a_2$.

[ 323, 405,  155, 9 ]
[  64, 122,  128, 9 ]
[  61, 925,  100, 9 ]
[ 260, 751,   73, 9 ]
[ 607, 631,   46, 9 ]
[  13, 566,   19, 9 ]
[ 493, 554, 1028, 8 ]
[ 959, 596, 1002, 8 ]
[ 319, 693,  976, 8 ]
[ 594, 843,  950, 8 ]
[ 693,   7,  925, 8 ]
[ 562, 264,  899, 8 ]
[ 147, 575,  873, 8 ]

Table 5.1: A sequence of base-m representations


For fixed m, we now consider the coefficients $a_i^{(m+k)}$ of the base-(m + k) expansion
of N, as functions of $a_i^{(m)}$ and k. The coefficients are related by the fact that

$\sum_{i=0}^{d} a_i^{(m)} m^i = \sum_{i=0}^{d} a_i^{(m+k)} (m+k)^i = N.$

Matching the coefficients of the polynomials $\sum_{i=0}^{d} a_i^{(m)} x^i$ and $\sum_{i=0}^{d} a_i^{(m+k)} (x+k)^i$ reveals
that

$a_i^{(m+k)} = \sum_{j=i}^{d} a_j^{(m)} \binom{j}{i} (-k)^{j-i} \bmod (m+k)$ (5.2)

for i = 1,...,d−1. For $a_{d-1}$ this means that

$a_{d-1}^{(m+k)} = \left(a_{d-1}^{(m)} - d k\, a_d^{(m)}\right) \bmod (m+k).$ (5.3)
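The relation between neighbouring expansions can be checked numerically. The sketch below is an illustration under stated assumptions, with hypothetical function names: it expands the base-m polynomial in powers of (m + k), which gives the binomial sum of (5.2) before any reduction, and then propagates carries so that the lower coefficients fall back into the range of a base-(m + k) digit. Those carries from lower positions are why the observed change in a coefficient is only almost linear in k.

```python
from math import comb


def digits(N, m, d):
    """Unreduced base-m coefficients [a_0, ..., a_d] of N."""
    out = []
    for _ in range(d):
        N, r = divmod(N, m)
        out.append(r)
    return out + [N]


def shifted_unnormalised(a, k):
    """Coefficients of sum(a_j * (y - k)**j) in powers of y, with y standing for m + k."""
    d = len(a) - 1
    return [sum(a[j] * comb(j, i) * (-k) ** (j - i) for j in range(i, d + 1))
            for i in range(d + 1)]


def normalise(b, base):
    """Propagate carries so every coefficient below the leading one lies in [0, base)."""
    b = list(b)
    for i in range(len(b) - 1):
        carry, b[i] = divmod(b[i], base)
        b[i + 1] += carry
    return b


N, d = 9999399973, 3  # toy modulus from this chapter
a = digits(N, 1035, d)
b = shifted_unnormalised(a, 1)
print(a, b, normalise(b, 1036), digits(N, 1036, d))
```

For m = 1035 and k = 1 the second-highest entry of the unnormalised list is 19 − 3·9 = −8, which is 1028 modulo 1036, in agreement with (5.3).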


The value of m which causes the leading coefficient to decrease from $a_d$ to $a_d - 1$
is given by


$|a_{d-1}| < \chi m$ (5.4)


are easily determined. Moreover, the proportion of m-values satisfying (5.4) is
approximately $2\chi$. Hence, compared to choosing m at random, conditioning the search on
m guaranteed to give $|a_{d-1}| < \chi m$ increases its efficiency by a factor of approximately
$1/(2\chi)$. Typically we use $\chi = 0.02$, so the search efficiency increases by a factor of 25.
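As an illustration of conditioning the search on the changeover point, the following sketch locates the last m for which the leading coefficient still equals a given value and lists the nearby m whose second coefficient is small. It rests on assumptions that are not taken from the text: the changeover is computed as the integer part of the d-th root of N/a_d, which follows from the leading coefficient being the integer part of N/m^d, the function names are hypothetical, and carries from lower coefficients during centring are ignored.

```python
def iroot(n, d):
    """Largest integer r with r**d <= n (exact binary search on integers)."""
    lo, hi = 0, 1
    while hi ** d <= n:
        hi *= 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** d <= n:
            lo = mid
        else:
            hi = mid
    return lo


def last_m_with_leading(N, a_d, d):
    """Largest m whose base-m expansion of N still has leading coefficient a_d."""
    return iroot(N // a_d, d)


def second_coefficient(N, m, d):
    """Unreduced coefficient a_{d-1} of the base-m expansion of N."""
    return (N // m ** (d - 1)) % m


def small_second_coefficients(N, d, a_d, chi, window):
    """(m, centred a_{d-1}) for m near the changeover with |a_{d-1}| < chi * m."""
    m0 = last_m_with_leading(N, a_d, d)
    hits = []
    for m in range(m0 - window, m0 + window + 1):
        a = second_coefficient(N, m, d)
        if a > m // 2:
            a -= m  # centre into (-m/2, m/2]; lower-order carries are ignored here
        if abs(a) < chi * m:
            hits.append((m, a))
    return hits


print(small_second_coefficients(9999399973, 3, 9, 0.02, 3))
```

With $\chi = 0.02$ a randomly chosen m would pass the test with probability about $2\chi = 0.04$, which is where the factor $1/(2\chi) = 25$ comes from.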

This method does not give much information on where m must lie for the lower order
coefficients to be small. For example, the third coefficient of a quintic
representation is

$a_3^{(m+k)} = \left(10 a_5^{(m)} k^2 - 4 a_4^{(m)} k + a_3^{(m)}\right) \bmod (m+k),$

which in practice means that the change in $a_3^{(m+k)}$ as a function of k is no longer
sufficiently small to be useful.

Consider now the problem of generating non-skewed polynomials with better than
average root properties. Recall that we regard $F_1(x,y)$ as having two types of roots
modulo p, projective and non-projective. Here we equip $F_1$ with better than average
root properties by forcing it to have good projective roots modulo small $p^k$. The
appearance of good non-projective roots at this stage we leave to chance.

The following example illustrates the effect of this observation on the distribution
of root properties amongst polynomials examined.


Example 5.1.3 Let N = RSA-140. A reasonable range of leading coefficients of
non-skewed $f_m$ for N is [$10^{20.3}$, $10^{21.3}$]. We choose $a_d$ in this range to contain a cofactor c
in each of the following five cases:

1. the worst case, $a_d$ is prime,

2. the average case, $a_d$ is chosen uniformly at random,

3. a good case, $c \mid a_d$ with c = 2 · 3 · 5 · 7,

4. a better case, c = 2 · 3 · 5 · 7 · 11 · 13 · 17 · 19 · 23 · 29 ≈ $10^{9.8}$, and

5. an even better case, c = $2^5 \cdot 3^4 \cdot 5^3 \cdot 7^3 \cdot 11^2$ · 13 · 17 · 19 · 23 · 29 ≈ $10^{16.6}$.

Randomly chosen samples of 100 polynomials in each of the five cases above gave
the following $\alpha$ values.

Table 5.2: Polynomials with many small projective roots

Despite c being large in case 5, it is still sufficiently small to allow examination of
many polynomials.
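The sizes of the cofactors in cases 3 to 5 are easy to confirm. The short check below is only arithmetic; it assumes that the exponent of 7 in case 5 is 3, which is the value that reproduces the stated size of about $10^{16.6}$.

```python
from math import log10, prod

case3 = prod([2, 3, 5, 7])
case4 = prod([2, 3, 5, 7, 11, 13, 17, 19, 23, 29])
# Exponent of 7 assumed to be 3 (see the lead-in).
case5 = 2**5 * 3**4 * 5**3 * 7**3 * 11**2 * 13 * 17 * 19 * 23 * 29

for name, c in [("case 3", case3), ("case 4", case4), ("case 5", case5)]:
    print(name, c, round(log10(c), 1))
```

Even the largest cofactor is several orders of magnitude below the leading coefficients considered, so many multiples of c remain available in the chosen interval.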


Note that computing the ideal decomposition for ideals corresponding to projective 
roots requires more effort than those corresponding to non-projective roots. Hence 
polynomials found by this method will require marginally more effort in the square 
root stage, but the benefit far outweighs this extra cost. 


Our procedure for finding non-skewed polynomials is the following. 


Procedure 5.1.4 (Non-skewed Base-m Polynomials)

1. Fix an interval for $a_d$ in which each $a_d$ is significantly smaller than its
corresponding m. In fact, select $\chi_1, \chi_2$ for which $a_d$ will satisfy
$\chi_1 < |a_d|/m < \chi_2$. The interval for $a_d$ is then bounded below and above by

$\log a_d = \frac{d \log \chi_j + \log N}{d+1}$

at j = 1, 2 respectively. This corresponds to a range of m values bounded below
and above by

$\log m = \frac{\log N - \log \chi_j}{d+1}$

at j = 2, 1 respectively.

2. Fix a cofactor c of $a_d$, with c a product of many small $p^k$. Of course several c will
be used. For each $a_d$ divisible by c in the interval described in Step 1, determine
from (5.3) the values of m for which $|a_{d-1}| < \chi m$ (with $\chi > \chi_2$).

3. For each m identified in Step 2, check the remaining coefficients of $f_m$. If $f_m$
is $\chi$-small, compute an approximation to $\alpha(F_1)$, and if that is also sufficiently
small, output $f_m$.
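The two bounds in Step 1 follow from $a_d \approx \chi m$ together with $a_d m^d \approx N$. A minimal numerical sketch, with a hypothetical function name and an arbitrary illustrative N rather than an RSA modulus:

```python
from math import exp, log


def step1_window(N, d, chi1, chi2):
    """Return ((a_lo, a_hi), (m_lo, m_hi)) from the Step 1 formulas, as floats."""
    def log_a(chi):
        return (d * log(chi) + log(N)) / (d + 1)

    def log_m(chi):
        return (log(N) - log(chi)) / (d + 1)

    # a_d grows with chi while m shrinks, hence the reversed order for m.
    return (exp(log_a(chi1)), exp(log_a(chi2))), (exp(log_m(chi2)), exp(log_m(chi1)))


(a_lo, a_hi), (m_lo, m_hi) = step1_window(10**30, 5, 0.001, 0.01)
print(a_lo, a_hi, m_lo, m_hi)
```

Each endpoint pairs a value of $a_d$ with the m at the same $\chi_j$, so the smallest $a_d$ goes with the largest m.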


5.1.2 Skewed Polynomials 


We turn to our more involved procedure for finding good skewed polynomials. It is
not just skewed, but highly skewed polynomials that are the real target of this section,
because we have a procedure for finding descendants of highly skewed $F_1$ with excellent
root properties. The following example illustrates a highly skewed target.


Example 5.1.5 Let N = RSA-140. We give two pairs of polynomials for N. The
pair ($F_1$, $F_2$) is the pair of skewed polynomials used for the factorisation of RSA-140.
The pair ($G_1$, $G_2$) is the best non-skewed pair identified during the search for RSA-140
polynomials.

$F_1(x,y)$ = 439682082840 $x^5$
           + 390315678538960 $x^4 y$
           − 7387325293892994572 $x^3 y^2$
           − 19027153243742988714824 $x^2 y^3$
           − 63441025694464617913930613 $x y^4$
           + 318553917071474350392223507494 $y^5$

$F_2(x,y)$ = x − 34435657809242536951779007 y

$G_1(x,y)$ = 237866611103421300000 $x^5$
           − 514856715582822510304 $x^4 y$
           − 4722668925346720843884 $x^3 y^2$
           + 6545365626333869758617 $x^2 y^3$
           − 3356924353646091366162 $x y^4$
           − 5142225622472630020004 $y^5$

$G_2(x,y)$ = x − 617119742304446938751913 y.

The recommended sieve rectangle for $F_1$ has s = 4096. Both $F_1$ and $G_1$ are
unusually small over their respective regions ($\chi(G_1) = 0.011$). Both also have good
root properties, but the main difference between the polynomials is in just how good
their root properties are. We have $\alpha(F_1) = -7.0$ and $\alpha(G_1) = -4.2$.


How does this difference in root properties come about? Notice that $|a_1|$ and $|a_0|$
of $F_1$ are larger than m. Clearly on construction of $f_m$ they cannot start that way.
In fact we adjust $f_m$ to cause it to appear highly skewed, compensating for large low
order (in x) coefficients by skewing the sieve region. Once the low order coefficients
are large we try many further adjustments hoping for some which lead to polynomials
with excellent root properties.

During the skewing process f may move “off-centre” along the x-axis. We also
make an adjustment that fixes this. Hence, we adjust f in two ways:


• Translation by t: $f_t(x) = f(x - t)$ for some $t \in \mathbb{Z}$. This leaves root properties
unaltered but can improve size. With $m_t = m + t$ we preserve (5.1).

• “Rotation” by P: $f_P(x) = f(x) + P(x) \cdot (x - m)$ for some polynomial P whose
degree is small compared to d. This preserves (5.1). Rotation by P can alter
both size and root properties. Presently we use only linear P, but for higher
degree $f_m$ polynomials, higher degree P could be used without impinging on the
high order coefficients of f.
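Both adjustments are simple operations on coefficient lists, and the claim that they preserve (5.1) can be checked directly. The sketch below uses hypothetical helper names and the small modulus N = 9999399973 with m = 1035 purely for illustration; coefficients are stored low order first.

```python
from math import comb


def evaluate(f, x):
    """Evaluate sum(f[i] * x**i)."""
    return sum(c * x**i for i, c in enumerate(f))


def translate(f, t):
    """Coefficients (low order first) of f(x - t)."""
    d = len(f) - 1
    return [sum(f[j] * comb(j, i) * (-t) ** (j - i) for j in range(i, d + 1))
            for i in range(d + 1)]


def rotate(f, P, m):
    """Coefficients of f(x) + P(x) * (x - m); requires deg P < deg f."""
    out = list(f)
    for i, p in enumerate(P):
        out[i] -= p * m   # contribution of p * x**i * (-m)
        out[i + 1] += p   # contribution of p * x**i * x
    return out


N, m = 9999399973, 1035
f = [13, 566, 19, 9]                # base-m coefficients of N
f_t = translate(f, 7)               # root moves from m to m + 7
f_P = rotate(f, [-5, 3], m)         # P(x) = 3x - 5
print(evaluate(f_t, m + 7) % N, evaluate(f_P, m) % N, f_P[-1])
```

A linear P changes only the three lowest coefficients of a cubic, so the leading coefficient, and with it the projective roots, is unaffected.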


Translation by t need not be peculiar to skewed polynomials, but we make more 
use of it here than with non-skewed polynomials. Most of the benefit comes from 
rotations by well chosen P. Indeed we use two rotation steps. The first is aimed at 
producing highly skewed f which are unusually small over some skewed rectangle. The 
second is aimed at taking these f and rotating them to form new ones which, whilst 
retaining desirable size properties of the old ones, also have excellent root properties. 
Our implementation of this procedure is due mainly to Peter Montgomery. 

We describe this procedure in four steps. In Steps 1 and 2 we isolate skewed 
polynomials which are unusually small over some rectangle. Polynomials surviving 
Step 2 enter Step 3. In Step 3 we seek rotations giving polynomials with excellent root 
properties without destroying the good size properties inherited from Step 2. Step 4 
produces the output. 


Procedure 5.1.6 (Skewed Base-m Polynomials)

1. Find leading coefficients $a_d$ divisible by many small $p^k$ for which there exists
a base-m expansion with skewed coefficients. For each such $a_d$ we examine


Check the magnitude of $a_{d-1}$, and of $a_{d-2}$ compared to m, by computing the
integral and non-integral parts of

$\frac{N - a_d m^d}{m^{d-1}} = a_{d-1} + \frac{a_{d-2}}{m} + O(m^{-2}).$

If these are sufficiently small, accept $a_d$ and m.
2. Compute some initial adjustments to $f_m$ aimed at skewing it further and reducing
its size over a new skewed rectangle. In particular, we consider variables $c_1$, $c_0$, t
and s, the adjustments

• translation by t,

• rotation by $P(x) = c_1 x - c_0$,

and the rectangle with $|x| \le \sqrt{s}$ and $|y| \le 1/\sqrt{s}$. Call this rectangle S, and
denote by the subscript t variables which have been translated by t. Now let

$f(x) = f_m(x_t) + (c_1 x_t - c_0)(x_t - m_t)$, and
$F(x,y) = y^d f(x/y)$.


At this stage we treat $c_1$, $c_0$ and t as real variables. We now apply a multi-variable
minimization procedure to minimize

$\iint_S F^2(x,y)\,dx\,dy$

with respect to $c_1$, $c_0$, t and s. The optimal values of $c_1$, $c_0$ and t are rounded to


integers, and s is recomputed. The average log size I(F,S) over the new S is
estimated, with

$I(F,S) = \log \sqrt{\iint_S F^2(x,y)\,dx\,dy}$

and if that is sufficiently small, we proceed to Step 3.

3. Search for polynomials with excellent root properties amongst polynomials with
similar size properties to f. Let $j_1$, $j_0$ be integers with $|j_1| < J_1$ and $|j_0| < J_0$.
Typically we have $J_1 < J_0$. We investigate the polynomials

$f_{j_1,j_0}(x) = f(x) + (j_1 x - j_0)(x - m)$

using a sieve-like procedure to identify $j_1$, $j_0$ pairs which ensure $f_{j_1,j_0}$ has good
root properties.


We describe the sieve-like procedure. For each small prime p we consider
contributions of $p^k$ for $k \ge 1$. Take $p^k$ and $j_1$ to be fixed, $j_0$ and l to be variable. The
values $f_{j_1,j_0}(l) \bmod p^k$ can be computed quickly for successive $l = 0, \ldots, p^k - 1$ by
finite differences. For each such l we find, simply by solving a linear congruence,
$j_0 \in \mathbb{Z}_{p^k}$ for which

$f_{j_1,j_0}(l) \equiv 0 \bmod p^k.$ (5.5)

For each solution $j_0$ of (5.5) we estimate $\mathrm{cont}_{p^k}(F_{j_1,j_0})$, and in an array of length
$p^k$ record $\mathrm{cont}_{p^k}(F_{j_1,j_0})$ in the position corresponding to $j_0$. We also record
$\mathrm{cont}_{p^k}(F_{j_1,j_0})$ at any projective roots. On completion mod $p^k$, this array is
replicated throughout the entire $J_0$-space.

Once this is completed for all small p and all $j_1$, the values in the $j_1, j_0$ array
approximate $\alpha(F_{j_1,j_0})$ over the small primes considered.
4. Since $I(F_{j_1,j_0}, S) \approx I(F, S)$ we already have an approximation to the average size
of $F_{j_1,j_0}$. So we take the initial rating of each $F_{j_1,j_0}$ to be

$I(F_{j_1,j_0}, S) + \alpha(F_{j_1,j_0}).$

For those $F_{j_1,j_0}$ whose initial rating is sufficiently low, we compute the coefficients
of

$f(x) + (j_1 x - j_0)(x - m),$

and if it helps we compute translation of m and a new optimal value s for the
translated F.
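The sieve-like search of Step 3 can be illustrated for a single prime p with k = 1. The sketch below is a simplification under stated assumptions, not the implementation used for the factorisations: it counts roots rather than accumulating the weighted contributions described in Step 3, it evaluates f(l) directly instead of by finite differences, it ignores projective roots and higher powers of p, and the function names are hypothetical.

```python
def evaluate(f, x):
    """Evaluate sum(f[i] * x**i)."""
    return sum(c * x**i for i, c in enumerate(f))


def root_sieve_mod_p(f, m, p, j1, J0):
    """For fixed j1 and prime p, count the roots l mod p of
    f(x) + (j1*x - j0)*(x - m) for every j0 in [-J0, J0]."""
    per_residue = [0] * p                 # array of length p indexed by j0 mod p
    for l in range(p):
        g = (l - m) % p
        if g == 0:
            continue                      # j0 drops out when l = m mod p; ignored here
        # f(l) + (j1*l - j0)*(l - m) = 0  =>  j0 = j1*l + f(l)/(l - m)  (mod p)
        j0 = (j1 * l + evaluate(f, l) * pow(g, -1, p)) % p
        per_residue[j0] += 1
    # replicate the length-p array across the whole j0 range
    return {j0: per_residue[j0 % p] for j0 in range(-J0, J0 + 1)}


f, m = [13, 566, 19, 9], 1035             # toy base-m polynomial for N = 9999399973
print(root_sieve_mod_p(f, m, 7, 2, 10))
```

Each value of l costs one modular inversion and marks a single residue class of $j_0$, so the work per prime is proportional to p rather than to the size of the $j_0$ range. The call `pow(g, -1, p)` needs Python 3.8 or later.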


5.2 Isolating the Best Polynomials 


From the procedures of the previous section we inherit a collection of many good 
polynomials. This collection may contain thousands of polynomials. Despite the fact 
that these are all good polynomials, there is still significant variation between best 
and worst yields in the collection. Indeed, the variation may exceed a factor of 50%. 
Since there are still too many polynomials to conduct sieving experiments, we require 
a fast and reliable procedure for rating the polynomials according to their yield, and 
therefore identifying the best ones. 

Below we outline this procedure. For simplicity of explanation we consider the 


non-skewed case first. The skewed case is then a simple generalisation. 


5.2.1 Non-skewed Polynomials 


Consider first only the non-linear polynomial $F_1(x,y)$. Often with non-skewed
polynomials, all considered values of m are similar, so the rating is determined mainly by
$F_1$. Later we make trivial adjustments to consider $F_2$ also.

To ensure reliability of the rating it is crucial to have an accurate estimate of $\alpha(F_1)$.
Hence, at this point we compute $\mathrm{cont}_p(F_1)$ for small p directly. That is, from a sample
of $F_1$-values we count appearances of $p^k$ for $k \ge 1$ and take $\mathrm{cont}_p(F_1)$ to be the mean
number of appearances per value. This slows the procedure somewhat, but since small
$p^k$ make a large difference to yield and since usually some small p are not well behaved,
the extra computation is important. In fact we compute $\mathrm{cont}_p(F_1)$ directly for p < 100,
and estimate $\mathrm{cont}_p(F_1)$ for 100 < p < 2000.
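Counting appearances of $p^k$ for all $k \ge 1$ in a value is the same as taking the exponent of p in that value, so the direct computation amounts to a mean valuation over a sample. A minimal sketch with hypothetical function names; how the sample points are chosen is left to the caller and is not specified here.

```python
def valuation(n, p):
    """Exponent of the largest power of p dividing the non-zero integer n."""
    n = abs(n)
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def mean_valuation(F, sample, p):
    """Mean exponent of p over the non-zero values F(x, y) for (x, y) in sample."""
    values = [F(x, y) for x, y in sample]
    values = [v for v in values if v != 0]
    return sum(valuation(v, p) for v in values) / len(values)


# Toy check with F(x, y) = x: the exponents of 2 in 1..8 are 0,1,0,2,0,1,0,3.
print(mean_valuation(lambda x, y: x, [(x, 1) for x in range(1, 9)], 2))
```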


Now, since $F_1$ is homogeneous, in polar coordinates

$F_1(x,y) = r^d F_1(\cos\theta, \sin\theta).$

At fixed $\theta = \theta_i$ any two polynomials of degree d grow as the d-th power of r along
$\theta_i$. So the values $F_1(\cos\theta_i, \sin\theta_i)$ are the most relevant for rating the yields of these
polynomials.
Hence we fix r = 1 and put

$u_{F_1}(\theta_i) = \frac{\log |F_1(\cos\theta_i, \sin\theta_i)| + \alpha(F_1)}{\log B}.$

We divide the interval $[0, \pi]$ uniformly into K sub-intervals and put

$\theta_i = \frac{\pi}{K}\left(i - \frac{1}{2}\right)$

for i = 1,...,K. That is, $\theta_i$ is the mean value of $\theta$ on the i-th sub-interval. Now we
compute

$E(F_1) = \sum_{i=1}^{K} \rho(u_{F_1}(\theta_i))$ (5.6)

and take $E(F_1)$ to be an estimated rating of $F_1$. That is, polynomials are ranked in
descending order of $E(F_1)$ values. The value of K is not crucial to the comparison
between polynomials, but we use K = 1000.
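The rating (5.6) is straightforward to evaluate once $\rho$ is available. The sketch below assumes that $\rho$ is Dickman's function and tabulates it from the identity $u\rho(u) = \int_{u-1}^{u} \rho(t)\,dt$ with the trapezoidal rule. The grid size, the nearest-grid-point lookup, the guard against a zero value and all function names are choices made for this illustration; the coefficient list is stored so that `coeffs[j]` multiplies $x^j y^{d-j}$.

```python
from math import cos, log, pi, sin


def dickman_table(u_max, n=1000):
    """Tabulate rho at u = k/n, k = 0..u_max*n, from u*rho(u) = integral of rho on [u-1, u]."""
    h = 1.0 / n
    rho = [1.0] * (n + 1)                # rho(u) = 1 on [0, 1]
    inner = float(n - 1)                 # sum of rho[k-n+1 .. k-1] for k = n + 1
    for k in range(n + 1, int(u_max * n) + 1):
        value = h * (rho[k - n] / 2 + inner) / (k * h - h / 2)
        rho.append(value)
        inner += value - rho[k - n + 1]  # slide the window for the next k
    return rho


def rho_at(table, u, n=1000):
    """Nearest-grid-point lookup; rho is 1 for u <= 1 and treated as 0 beyond the table."""
    if u <= 1:
        return 1.0
    k = int(round(u * n))
    return table[k] if k < len(table) else 0.0


def e_rating(coeffs, alpha, B, K=1000, n=1000):
    """E(F) of (5.6) for the homogeneous polynomial sum(coeffs[j] * x**j * y**(d-j))."""
    table = dickman_table(20, n)
    d = len(coeffs) - 1
    total = 0.0
    for i in range(1, K + 1):
        theta = pi / K * (i - 0.5)       # midpoint of the i-th sub-interval
        x, y = cos(theta), sin(theta)
        value = abs(sum(c * x**j * y ** (d - j) for j, c in enumerate(coeffs)))
        u = (log(max(value, 1e-300)) + alpha) / log(B)
        total += rho_at(table, u, n)
    return total


print(e_rating([1, 0, 1], 0.0, 1000.0, K=100), e_rating([10**9, 0, 10**9], 0.0, 1000.0, K=100))
```

Scaling every coefficient up lowers the rating, as expected, while a more negative `alpha` raises it. Only the ordering of the ratings is meaningful, which is why a coarse table lookup is adequate for a sketch.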


Now consider also the linear polynomial $F_2(x,y)$. Smoothness bounds for $F_1$ and
$F_2$ may be different, so we denote by $B_{F_j}$ the smoothness bound for $F_j$. With

$u_{F_j}(\theta_i) = \frac{\log |F_j(\cos\theta_i, \sin\theta_i)| + \alpha(F_j)}{\log B_{F_j}}$

for j = 1, 2 we take $E(F_1, F_2)$ defined by

$E(F_1, F_2) = \sum_{i=1}^{K} \rho(u_{F_1}(\theta_i))\,\rho(u_{F_2}(\theta_i))$ (5.7)

to be the estimated rating of a given pair of polynomials. That is, pairs of polynomials
are ranked in descending order of $E(F_1, F_2)$ values.


Note 5.2.1 The values $E(F_1)$ for single polynomials should be compared only between
polynomials $F_1$ of the same degree, and $E(F_1, F_2)$ values for pairs of polynomials should
be compared only between pairs of polynomials which are pairwise of the same degree.

Note 5.2.2 We observe the ranking induced by E to be independent of variations
in B. Amongst all polynomials this is not necessarily true, but amongst a set of
candidate polynomials in practice we expect this to be the case. Hence for example,
using sub-optimal smoothness bounds should not change the fact that a polynomial is
“good”.
5.2.2 Skewed Polynomials


Here we generalise the previous computation to give a fair profile of $F_1$ and $F_2$ across
a skewed region whose length to width ratio is s. Note that since values of m amongst
polynomials may now differ substantially, both $F_1$ and $F_2$ should be considered at all
times.

Consider $F_1$ and $F_2$ around an ellipse of fixed area, whose major and minor axes
are in the ratio s. In particular let $s_1 = \sqrt{s}$ and $s_2 = 1/\sqrt{s}$ and consider the ellipse

$x = s_1 \cos\theta, \quad y = s_2 \sin\theta$

for $\theta \in [0, \pi]$. We again divide the $\theta$ interval uniformly into K equal sub-intervals and
take $\theta_i$ for i = 1,...,K to be the mean value of $\theta$ in each sub-interval.
With

$u_{F_j}(\theta_i) = \frac{\log |F_j(s_1 \cos\theta_i, s_2 \sin\theta_i)| + \alpha(F_j)}{\log B_{F_j}}$

for j = 1, 2 we take $E(F_1, F_2)$ defined by

$E(F_1, F_2) = \sum_{i=1}^{K} \rho(u_{F_1}(\theta_i))\,\rho(u_{F_2}(\theta_i))$ (5.8)

to be the rating of a given pair of polynomials. That is, pairs of polynomials are ranked
in descending order of $E(F_1, F_2)$ values.

For any given N, at most approximately twenty polynomial pairs with highest E
ratings will then be subjected to short sieving experiments.


5.3 Summary 


In this chapter we have described methods for finding good base-m polynomials. We
consider both skewed and non-skewed cases. In each case we consider two problems.
The first is the problem of generating large samples of polynomials which are small
and have good root properties. In the case of non-skewed polynomials, we look only
amongst polynomials whose first two coefficients are known to be small, and which
have many projective roots modulo small $p^k$. We leave the appearance of many
non-projective roots modulo small $p^k$ to chance. In the case of skewed polynomials, we look
only amongst polynomials with skewed coefficients and unusually small average size
over some skewed rectangle, and with many projective roots mod small $p^k$. Moreover,
we make use of the fact that the last few coefficients of highly skewed polynomials
are large, with an efficient method of isolating polynomials which also have good
non-projective roots modulo small $p^k$.

Having generated a large sample of good polynomials the second problem becomes
isolating, without sieving, the best polynomials in the sample. In both skewed and
non-skewed cases we do this by profiling the smoothness probability of each pair $F_1$, $F_2$
(adjusted for root properties), across the appropriate sieve region.


Chapter 6 


Polynomials for RSA 
Factorisations 


This chapter is a report on polynomial selection for several large RSA factorisations. 
We use the factorisations to exemplify and investigate the techniques described in 
previous chapters. 

The two new factorisations discussed here are RSA-140 and RSA-155. Recall from 
Section 1.2.4 the details of the RSA Factorisation Challenge. The factorisation of 
RSA-140, completed in February 1999 set a new general factorisation record. At the 
time of writing this thesis, sieving for the factorisation of RSA-155 is complete. We 
expect the new record to be announced by September 1999. In this chapter we also 
re-consider the previous record set in 1996, RSA-130, as a means of testing some of 
the procedures of previous chapters. 

Throughout this chapter we examine polynomials from two perspectives, namely 
local and global. The local perspective involves comparing individual polynomials and 
their properties. The global perspective involves placing polynomials in the context 
of the space of available polynomials - for example by asking questions like “how do 
the yields of the polynomials we now find compare to the yields of randomly chosen 
polynomials?”. We take both local and global perspectives on polynomials examined
for all three integers RSA-130, RSA-140 and RSA-155; however, we emphasize the local
perspective for the first two and the global perspective for the last one.

In Section 6.1 we examine polynomial selection for RSA-130. We seek polynomials 
which, under the conditions used for that factorisation and reported in [23], would 
improve the sieving time. Working under the conditions used in the factorisation 


means that we should consider only non-skewed polynomials. Hence we test Procedure 


5.1.4 and the E rating procedure of Section 5.2. Even using only these techniques, the 


improvement obtained in a comparatively tiny period of time is surprising. Ultimately 
we find that in a fraction of the time used for the actual RSA-130 polynomial search 
we identify several polynomials whose full yields are 1.5-2 times that of the polynomial 


used in the factorisation. We will consider briefly a global comparison, by comparing
the yield of the best polynomial so obtained to that of a polynomial of average yield
for RSA-130.

In Section 6.2 we turn to RSA-140. When actually conducting this search, we first 
considered using non-skewed polynomials found by Procedure 5.1.4. Once it was clear 
we could obtain polynomials with better root properties and good size by Procedure 
5.1.6, we decided to use the skewed polynomials. We use the RSA-140 factorisation to 
examine locally the properties of the best skewed polynomials, and to compare locally 
the best skewed and non-skewed polynomials. Since the procedure to find the skewed 
polynomials was under development during the search, we do not have a sufficiently 
large sample of skewed polynomials for detailed global considerations, so we consider 
only the comparison of best skewed polynomials to average skewed polynomials. 

Given the results achieved with RSA-140, we searched only amongst highly skewed 
polynomials for RSA-155. The results are discussed in Section 6.3. We conducted a 
more comprehensive search, indeed several users ran the search program. It is timely 
to mention that we are particularly grateful to Arjen Lenstra for porting the search 
code to use his multiple precision arithmetic package LIP. That allowed other users 
to run it. Bruce Dodson ran several search jobs for RSA-155 polynomials, and the 
polynomial chosen for the factorisation appeared from one of his searches. 

Since we therefore have a large and “stable” sample of good polynomials for RSA- 
155, we are able to make a more detailed global examination of the polynomials we 
find using Procedure 5.1.6. We do this in Section 6.3, as well as examining the top few 


polynomials locally. The global comparisons we make are aimed at

• placing the sample of polynomials generated during the search in the context of
randomly generated polynomials, and

• examining the trade-off between polynomial search time and the corresponding
saving in sieving time.


Section 6.4 contains a summary of this chapter. 
All polynomials which are not given explicitly in the text of this Chapter are given 
in Appendix B. 


6.1 RSA-130 


By re-examining the polynomial selection task for the factorisation of RSA-130 we
aim to test Procedure 5.1.4 for finding good non-skewed polynomials, and the ratings
$E(F_1, F_2)$ and $E(F_1)$ of Section 5.2. We discuss three sets of polynomials, $P_i$, $Q_i$, and
$R_i$. The $P_i$ polynomials are the actual candidates discussed in [23], the $Q_i$ are a better
set of candidates we generated, and the $R_i$ are the best candidates we generated.


6.1.1 The Fifteen Candidate Polynomials


The paper [23] describes a set of fifteen candidate polynomials considered for the 
factorisation of RSA-130. These fifteen polynomials were generated over some time, 
and identified as being good candidates on the basis of a rating measuring the size of 
the values taken by the polynomials [37]. The polynomials are labelled 1-15 according 
to this initial ranking, Polynomial 1 being the best ranked polynomial and Polynomial 
15 the worst. According to [23] all fifteen candidates were then subjected to extensive 
sieving experiments and ranked according to their “true” yield as revealed by those 
experiments. The sieving experiments measured the yield of each pair of polynomials 
$F_1$, $F_2$. The two left-most columns (reproduced from [23]) of Table 6.1 show the rank
and relative yields according to these experiments (we will refer later to the two
right-most columns of this table).


[23] yield (%) | $F_1$ yield (%)


Table 6.1: Sieving the RSA-130 polynomials


We refer to Polynomials 1,...,15 as $P_1, \ldots, P_{15}$. Polynomial $P_{14}$ was selected
for use in the factorisation of RSA-130, and it becomes the polynomial we use as a
benchmark for the polynomials we find.

Since they were selected on the basis of their size, the fifteen candidate polynomials
all have unusually small coefficients. The largest value of $\chi$ for any of these polynomials
is $\chi = 0.004$. However, they have a generally poor set of root properties. Table 6.2
gives the values $\alpha(F_1)$ for $P_1, \ldots, P_{15}$.


Table 6.2: Root properties of the RSA-130 polynomials


Most of the candidate polynomials have $\alpha(F_1) > 0$, so for these polynomials sieving
is being conducted over integers less likely to be smooth than random integers of the
same size. The reason $P_{14}$ performs much better in sieving experiments than expected
on the basis of its size alone is that $\alpha(P_{14})$ is significantly smaller than for the other
polynomials. Actually $\alpha(P_{14}) < 0$ not because $P_{14}$ has many non-projective roots
modulo small primes, but because the leading coefficient of $P_{14}$ has many small prime
factors. Indeed

$a_5 = 5748302248738405200 = 2^4 \cdot 3^4 \cdot 5^2 \cdot 13 \cdot 19^2 \cdot 331 \cdot 114213131.$

That is, $P_{14}$ has many projective roots for small p.
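The small prime factors of a leading coefficient can be read off by trial division. The sketch below uses a hypothetical function name and an arbitrary bound of 1000, and applies it to the leading coefficient quoted above; multiplying the result back together confirms that no factor has been dropped.

```python
def small_prime_part(n, bound):
    """Trial-divide n by every integer up to bound.
    Returns ({prime: exponent}, cofactor); composites never divide because
    their prime factors have already been removed."""
    factors = {}
    p = 2
    while p <= bound and n > 1:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    return factors, n


a5 = 5748302248738405200
factors, cofactor = small_prime_part(a5, 1000)
print(factors, cofactor)

# Multiply back to confirm the factorisation is complete.
product = cofactor
for p, e in factors.items():
    product *= p**e
print(product == a5)
```

Each prime power found here contributes a projective root of $F_1$ modulo that prime, which is the mechanism behind the negative $\alpha$ value.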


6.1.2 On the Reliability of E 


We now use the fifteen polynomials from [23] and more of our own to test the reliability 


of E as a pre-sieving yield rating procedure. 


Remark 6.1.1 Clearly the initial ranking of [23] based on size alone is inaccurate. 
Another method is mentioned without detail in [23] as being devised after the factori- 
sation and giving reasonable correlation with the true rank for that set of polynomials. 
That method is due to Peter Montgomery and involves estimating the root mean square 
of each polynomial in some region and subtracting an estimate of the contribution of 


the small primes to that value. Indeed, we use an estimate similar to this as a first 


filter in Procedure 5.1.6, and we will see that again in Section 6.2. Our E method can 


be viewed as an extension of this method. The key differences are that we make a
more accurate assessment of the effect of root properties, and we use the $\rho$ function to
emphasize regions of high smoothness probability where the polynomial takes smaller
values.
Table 6.3 gives the rankings on the fifteen candidate polynomials of [23] under our
$E(F_1, F_2)$ ranking and under application of Montgomery’s method of Remark 6.1.1.


Actual Yield [23] rank E rank 


Polys Rank | Polys Rank | Polys Rank 
Table 6.3: Rankings of the 15 candidate RSA-130 polynomials, $E(F_1, F_2)$


The first column gives the true ranking revealed by the sieving experiments in [23]. 
Its left sub-column contains the polynomial labels, and its right sub-column contains 
the rank. So P¡4 is ranked first, and Pg is ranked fifteenth. The second column gives 
the ranking induced by the method mentioned in [23]. Its left sub-column lists the 
polynomial labels in the order in which they are ranked, and its right sub-column gives 
the true rank of each polynomial. So this method ranks Py in first position (but P,’s 


true yield ranks second), $P_{14}$ in second position (but $P_{14}$'s true yield ranks first), and
so on. The third column gives the ranking induced by E, listed in the same fashion as
the second column.


One measure of the reliability of a ranking is its correlation coefficient with the true 
ranking, as defined in [23]. The correlation with the true rank for the method of [23] is 
r = 0.86, and for the E method r = 0.91. That is not a dramatic improvement, but the 


E ranking does seem to be more successful as a predictor, in the sense that it identifies 


better the very best polynomials. The trade-off of course is that our method is slightly 


more time consuming to compute. Indeed as indicated in Section 5.2 a method similar 


to that described in [23] is used before E to screen out the very worst polynomials. 
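For readers who want to reproduce a figure of this kind, one common way to compare two rankings is Spearman's rank correlation. Whether this is exactly the coefficient defined in [23] is an assumption; the definition is not reproduced here and may differ. The sketch takes two lists giving the rank of each polynomial under the two orderings, with no ties, and the function name is hypothetical.

```python
def spearman(rank_a, rank_b):
    """Spearman rank correlation of two tie-free rankings of the same n items."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("need two rankings of the same length, at least 2")
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))


print(spearman([1, 2, 3, 4, 5], [2, 1, 3, 4, 5]))
```

Identical rankings give 1 and exactly reversed rankings give −1, so values such as 0.86 and 0.91 both indicate strong agreement with the true ranking.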




6.1.3 Eighteen Different Candidates 


We now introduce a new set of candidate polynomials labelled $Q_1, \ldots, Q_{18}$. These
polynomials were generated using only the observation of Section 5.1.1 which isolates
polynomials with small $a_d$ and $a_{d-1}$, then screening by E. We did not force these
polynomials to have highly smooth $a_d$; root properties were left entirely to chance.


The purpose of exhibiting these polynomials is two-fold: 


• to show that even leaving root properties to chance, significant improvements can
be made provided we know what to look for (that is, having a reliable pre-sieving
rating procedure)

• to give another set of test polynomials on which the E rating ought to work
reliably.


The relative yields of the polynomials Q; were determined by the sieving experi- 


ments described in the following remark. 


Remark 6.1.2 We sieved across the entirety of the rectangle —104 < z < 10% and 
1 < y < 10%, using only the quintic polynomial F}. Omitting the linear polynomial 
makes the experiments much quicker, and does not significantly affect the outcome 
since all values of m used are similar. Furthermore, in the first instance we sieved 
only for full relations (B = 11380951). To verify the reliability of these smaller and 
restricted sieving experiments, we also sieved each of the quintic candidate polynomials
$P_1, \ldots, P_{15}$ in this way. The results form the right hand columns of Table 6.1. The
correlation coefficient of the ranking according to our experiments and the ranking
according to experiments of [23] is 0.92. The differences appear only where the relative
yields are very close. The full yield of $P_{14}$ we obtained is 15990 relations; from this
the relative yields below can be placed in perspective.


The left column of Table 6.4 lists the polynomials labelled 1,...,18 in the order
in which their yields appear from our sieving experiments. Its right sub-column gives
the full yield relative to the benchmark $P_{14}$. So the best polynomial found by these
primitive means has a full yield 47% better than that used to factorise RSA-130. The
middle and right hand columns give the rankings of these polynomials by the method of
[23] (adjusted to consider only the quintic polynomial) and the ranking $E(F_1)$ (although
as we would expect, adjusting the rankings to take into account only the non-linear
polynomial makes little difference). The former method has correlation r = 0.53 and
the latter r = 0.91 with the ranking revealed by the sieving experiments. Notice that
E again isolates the very best polynomials reliably. Since the range of relative yields
of $Q_1, \ldots, Q_{18}$ is smaller than that in $P_1, \ldots, P_{15}$, we consider them a more difficult
set of polynomials to rank well.
Actual Yield: Rank, Poly, yield cf. $P_{14}$ | [23] Rank | E Rank


Table 6.4: Rankings of 18 better RSA-130 polynomials, E(F1) 


Amongst the polynomials generated during this search, we found several with
coefficients as small as those in $P_1, \ldots, P_{15}$. However, none of these polynomials have yields
significantly better than $P_{14}$. Instead, the better polynomials here all have coefficients
significantly larger than those of $P_1, \ldots, P_{15}$ but better root properties. Polynomial
$Q_1$ for example has $\chi = 0.016$. Table 6.5 shows the root properties of $Q_1, \ldots, Q_{18}$.

Of course the polynomials $Q_i$ are of no practical use, because their yields are poor
compared to the polynomials we exhibit next.


6.1.4 Good Polynomials for RSA-130 


We conducted a brief search for RSA-130 polynomials using the entirety of Procedure 


5.1.4, combined with the E ranking procedure. 
Table 6.6 shows the full yields of the best polynomials identified in this search,
relative to the yield of $P_{14}$ and according to experiments conducted as in Remark
6.1.2. We exhibit also values of $\alpha$ and $\chi$ for each polynomial.
So the best polynomial identified by this method has a full yield twice that of the
polynomial used to factorise RSA-130.

Table 6.5: Root properties of $Q_1, \ldots, Q_{18}$


Table 6.6: Good RSA-130 polynomials 


We performed more experiments to confirm the yield of polynomial $R_1$. Sieving for
full and large prime relations using $B_1 = 11380951$ and $B_2 = 120000000$ in accordance
with [23] revealed a total yield (the sum of full yield, 1LP-yield and 2LP-yield)
1.83 times that of polynomial $P_{14}$. Hence the true benefit obtained from $R_1$ could
be anything from a factor of approximately 1.8 to 2, depending on the particulars of
the sieving technique. We also repeated this experiment using both the linear and
algebraic polynomials in each case, using $B_1 = 3497867$ and $B_2 = 120000000$ for the
linear polynomial in accordance with [23]. We again found that the total yield of $R_1$
is 1.83 times that of $P_{14}$.

Compare the $\alpha$ values of the polynomials $R_i$ to those of the polynomials $Q_i$. The
difference is of course due to the forced projective roots in the $R_i$ polynomials. Another
illustration of this effect is Example 5.1.3.

Figure 6.1 shows values of the homogeneous polynomial R1(x,y) over a portion of
the sieve region. The portion is -10^3 < x < 10^3 and 1 < y < 10^3. The diagonal lines
emanating from the origin are the three real roots x/y of R1. Most of the relations of
course will come from the darker “valleys” carved by the real roots.


Figure 6.1: Polynomial R1


Finally in this section we make a brief global observation by comparing the yield of
R1 to that of a polynomial of average yield. To find a polynomial of average yield we
chose 100 base-m representations by choosing m uniformly at random in the relevant
range. We then took a polynomial of average yield to be the polynomial in this sample
whose E rating was closest to the mean rating of the sample. This suffices as an
approximation to a polynomial of average yield, and saves sieving a large sample of
polynomials. We found


m_avg = 12109254733486649468460.


Sieving as in Remark 6.1.2 shows that polynomial R1 has a full yield 5.9 times that of
the polynomial obtained from m_avg.
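The selection of an "average" polynomial can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used for the experiments: `base_m_coefficients` is the plain base-m expansion with non-negative digits (no coefficient adjustment), the ratings passed to `closest_to_mean` are assumed to come from some rating function that is not implemented here, and both function names are hypothetical.

```python
def base_m_coefficients(n, m, degree):
    """Return [a_0, ..., a_d] with n = sum(a_i * m**i), by repeated division.

    The digits a_0..a_{d-1} lie in [0, m); the leading coefficient a_d
    absorbs whatever remains after d divisions.
    """
    coeffs = []
    for _ in range(degree):
        n, remainder = divmod(n, m)
        coeffs.append(remainder)
    coeffs.append(n)
    return coeffs


def closest_to_mean(ratings):
    """Index of the rating nearest to the mean rating of the sample."""
    mean = sum(ratings) / len(ratings)
    return min(range(len(ratings)), key=lambda i: abs(ratings[i] - mean))


if __name__ == "__main__":
    # Toy example: 1000 = 6 + 2*7 + 6*7**2 + 2*7**3.
    print(base_m_coefficients(1000, 7, 3))
    # Hypothetical ratings for a sample of four polynomials.
    print(closest_to_mean([50.0, 55.0, 56.0, 60.0]))
```

In the setting of the text, one would draw 100 values of m at random, expand N in each base, rate each resulting polynomial, and keep the one returned by `closest_to_mean`.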


6.1.5 Some Timing Considerations 


It is instructive to examine the comparative timings of the three searches described
here. The actual search for RSA-130 polynomials (that is, the generation of the P_i
polynomials) occupied approximately three months on each of four processors [37].
Generation of the polynomials Q_i - using primitive means but with knowledge of what
to look for - occupied approximately one month on one processor. Generation of the
polynomials R_i - using Procedure 5.1.4, including forcing polynomials to have many
projective roots modulo small p - occupied approximately 36 hours on one processor.
Hence, even leaving non-projective roots modulo small p^k to chance we obtain an
improvement of approximately a factor of two in a trivial amount of time.


6.2 RSA-140 


Full details of the RSA-140 factorisation are to be found in [16]. Here of course we 
comment only on the polynomial selection. Since we used both skewed and non-skewed 
searches, this is a good chance to compare the results obtained and demonstrate the 
benefit gained using Procedure 5.1.6 over Procedure 5.1.4. 

We need to be mindful now that each “polynomial” is in fact a pair of polynomials.
In the case of non-skewed polynomials, this is not confusing because all values of
m are similar. However, in this section and the next we compare amongst skewed
polynomials, and compare skewed polynomials to non-skewed polynomials. We shall
now refer explicitly to polynomial pairs when necessary to avoid ambiguity.


6.2.1 Non-skewed Polynomials 


For the sake of comparison we include the best non-skewed polynomial pair found in
the initial search for RSA-140 polynomials using Procedure 5.1.4. We met this pair
earlier in Example 5.1.5. We have

G1(x,y) =   237866611103421300000 x^5
          - 514856715582822510304 x^4 y
          - 4722668925346720843884 x^3 y^2
          + 6545365626333869758617 x^2 y^3
          - 3356924353646091366162 x y^4
          - 5142225622472630020004 y^5


G2(x,y) = x - 617119742304446938751913 y


with α(G1) = -4.2 and χ(G1) = 0.011. When considering skewed polynomials of
course χ becomes an inappropriate quantity to consider, but we deal with this problem
in the next subsection.

As was the case with RSA-130, we compare the yield of the best non-skewed polynomial
to that of an average non-skewed polynomial. We chose an average non-skewed
polynomial for RSA-140 using the same procedure as described at the end of Section
6.1.4 for RSA-130 polynomials. That gave


m_avg = 440395459923337101533211.


Polynomial G1 has a full yield 5.9 times that of the polynomial given by m_avg. Notice
that this is (perhaps coincidentally) the same improvement over the average case that
is noted in Section 6.1.4 for RSA-130.
6.2.2 Skewed Polynomials - Local Considerations


The sieving for RSA-140 was conducted using a combination of sieving techniques. We
performed line sieving using the CWI siever, and lattice sieving using the AKL siever
(see Section 2.1.3). The final stage of the polynomial selection process is of course to
conduct sieving experiments on the chosen few, with the best performing polynomial
pair in those experiments becoming the chosen one.


Remark 6.2.1 The sieving experiments for RSA-140 were conducted at CWI using 
only the CWI siever. The rational factor base bound was 8000000, the algebraic 
factor base bound 16777215, and the large prime bounds 500000000 and 1000000000 
respectively. To obtain a reasonable profile of a polynomial pair over the entire sieve 
region in a short period of time, we used a sample of b-values across the region rather 
than every b-value in a short interval. Each pair was sieved over the same number of 
(a,b) pairs in a region skewed appropriately for that polynomial. It is possible that 
using only a sample of b-values may cause some projective behaviour modulo small p 
to be over or under emphasized during the experiment. However we do not consider
this a major problem.


Although the E ranking procedure works well, the value E(F) for any given F has no
real physical significance - it’s merely the ordering on E over a set of polynomials that
is relevant. In this section and the next we would like to interpret physical differences,
inasmuch as they influence yield, between the polynomials under investigation. Hence,
we also use data obtained as a first filter on polynomials from Procedure 5.1.6. In
particular, we have

$$I(F,S) = \iint_S \log |F(x,y)| \, dx \, dy$$

which we use to compute the average size of F over its rectangle S. Recall that S has
length to width ratio s. To compare relative sizes of different polynomials we compute
I(F,S), with the area of S invariant across the polynomials. We use

$$S = \{(x,y) \in \mathbb{R}^2 : -\sqrt{s} < x < \sqrt{s} \text{ and } -1/\sqrt{s} < y < 1/\sqrt{s}\}$$

and take I(F,S)/4 to be the average log size of F over S. We then use

E(F) = I(F,S)/4 + α(F)

as an initial and approximate rating of F.
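The average log size can be estimated numerically. The sketch below is a minimal illustration, assuming that I(F,S) is the integral of log |F(x,y)| over S and using a simple midpoint rule; the function name, the grid resolution and the decision to skip grid points where F vanishes are our own choices rather than part of Procedure 5.1.6.

```python
import math


def average_log_size(coeffs, s, steps=200):
    """Midpoint-rule estimate of I(F,S)/4 over the rectangle S of skewness s.

    coeffs = [a_d, ..., a_0] are the coefficients of the homogeneous form
    F(x,y) = a_d x^d + a_{d-1} x^(d-1) y + ... + a_0 y^d.
    S is -sqrt(s) < x < sqrt(s), -1/sqrt(s) < y < 1/sqrt(s), which has area 4.
    """
    d = len(coeffs) - 1
    x_half = math.sqrt(s)
    y_half = 1.0 / math.sqrt(s)
    hx = 2.0 * x_half / steps
    hy = 2.0 * y_half / steps
    total = 0.0
    for i in range(steps):
        x = -x_half + (i + 0.5) * hx
        for j in range(steps):
            y = -y_half + (j + 0.5) * hy
            value = sum(a * x ** (d - k) * y ** k for k, a in enumerate(coeffs))
            if value != 0.0:  # skip exact zeros, where log is undefined
                total += math.log(abs(value))
    return total * hx * hy / 4.0


if __name__ == "__main__":
    # Sanity check: F(x,y) = x on the square s = 1 has average log size -1.
    print(average_log_size([1, 0], 1.0, steps=400))
```

Adding α(F) to the returned value would give the approximate rating described above; computing α is not attempted here.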
Notice also that construction of this approximate rating is similar to the ideas
underpinning the procedure of [23] mentioned at Remark 6.1.1. We should be wary that,
as a ranking mechanism, the approximate rating is not as reliable as the E ranking.
Indeed, we do not even bother to consider F2(x,y) = x - my in computing it. We do find
however that, apart from being a useful and quick first filter, the approximate rating
adds to our understanding of the results.
Table 6.7 gives relevant statistics on the top five candidates for RSA-140, according
to experiments conducted pursuant to Remark 6.2.1. The sixth polynomial pair in the
table, F140, we describe later.


[Table columns: Poly, Rel. Yield, Av. Size, α, E, E rank; entries not recoverable]


Table 6.7: Relative yields of the top RSA-140 polynomials 


Although the E rank given in the table refers to the rank revealed by E over all
polynomials generated, the same cannot be said of the approximate ratings in the table.
Several other polynomials (with similar values of m) had approximate ratings similar to
those in the table, but were shown by E and sieving experiments to have inadequate yields.


Polynomial pair A140, which we have seen before, is the one used for the factorisation.
Indeed, A140 = (F1, F2) with


F1(x,y) =   439682082840 x^5
          + 390315678538960 x^4 y
          - 7387325293892994572 x^3 y^2
          - 19027153243742988714824 x^2 y^3
          - 63441025694464617913930613 x y^4
          + 318553917071474350392223507494 y^5


F2(x,y) = x - 34435657809242536951779007 y


and s = 4000. Notice that a5 factors as 2^3 · 3^2 · 5 · 7 · 11 · 13 · 41 · 29759. Since also 4|a4 and
2|a3, F1(x,y) is divisible by 8 whenever y is even. F1(x,y) has at least three roots x/y
modulo each prime from 3 to 17 (some of which are projective), and an additional 35
such roots modulo the 18 primes from 19 to 97.
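The arithmetic claims about F1 can be checked directly. The sketch below is illustrative only: the helper names are hypothetical, trial division is adequate only because a5 is small, and `count_roots_mod_p` counts distinct roots x/y modulo p (adding one projective root when p divides the leading coefficient), which may not be exactly the counting convention behind the figures quoted above.

```python
# Coefficients [a_5, ..., a_0] of F1 for the pair A140, as listed in the text.
F1_COEFFS = [
    439682082840,
    390315678538960,
    -7387325293892994572,
    -19027153243742988714824,
    -63441025694464617913930613,
    318553917071474350392223507494,
]


def trial_factor(n):
    """Prime factorisation of |n| by trial division, as {prime: exponent}."""
    n = abs(n)
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def evaluate(coeffs, x, y):
    """Value of the homogeneous form with coefficients [a_d, ..., a_0]."""
    d = len(coeffs) - 1
    return sum(a * x ** (d - k) * y ** k for k, a in enumerate(coeffs))


def count_roots_mod_p(coeffs, p):
    """Distinct roots x/y modulo p, including the projective root if p | a_d."""
    d = len(coeffs) - 1
    affine = sum(
        1
        for r in range(p)
        if sum(a * pow(r, d - k, p) for k, a in enumerate(coeffs)) % p == 0
    )
    projective = 1 if coeffs[0] % p == 0 else 0
    return affine + projective


if __name__ == "__main__":
    print(trial_factor(F1_COEFFS[0]))
    for p in (3, 5, 7, 11, 13, 17):
        print(p, count_roots_mod_p(F1_COEFFS, p))
```

The same helpers apply unchanged to any other quintic given by its coefficient list.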

By way of comparison to Figure 6.1 we include a similar figure, Figure 6.2, for
this F1. We use -44000 < x < 44000 and 1 < y < 22 to give the region displayed
approximately the same area as that in Figure 6.1.

As with R1 for RSA-130, the most fertile sources of relations are the valleys cut by
the real roots emanating from the origin.


Figure 6.2: F1(x,y) for RSA-140


We are now in a position to compare the polynomial pair A140 to the best non-skewed
pair (G1, G2) above. We performed sieving experiments of the type described
in Remark 6.2.1 on (F1, F2) and (G1, G2), using s = 1 for G1. We found the yield of
the best skewed pair A140 is 1.61 times that of (G1, G2). From this we also estimate
that A140 has a yield approximately 9.5 times that of the average non-skewed selection
(since 1.61 × 5.9 ≈ 9.5).

We note that the average size of G1 is 52.27. This is significantly larger than
that of the top five polynomials in Table 6.7, and is due to the fact that m is chosen
significantly larger in the skewed case to force the leading coefficients of the skewed
polynomials to be so small. Hence, particularly when looking close to the origin,
considering only the non-linear polynomial in E favours the skewed case.


6.2.3 Skewed Polynomials - Global Considerations 


We now compare A140 to a skewed selection of average yield. We generated a large
random sample of skewed polynomials using Procedure 5.1.6 with randomised a_d in
the appropriate range and without rotations which would normally secure good
non-projective root properties. We talk more about the distribution of random skewed
polynomials when discussing RSA-155. For now we compare A140 to a particular
average selection. We took the average selection to be the polynomial in the sample
whose E rating was closest to the mean value. This gives the pair F140 in Table 6.7.
As shown in Table 6.7, we find the yield of A140 is 7.8 times that of the average skewed
selection.

Over the entire random sample we found the mean average size of the quintic
polynomials F1 to be 49.15 and the mean α(F1) to be -0.35. Comparing to the values
of the top five candidates in Table 6.7 suggests that most of the benefit we are obtaining
comes from root properties rather than size.


6.2.4 Some Timing Considerations 


It is estimated that the time spent searching for RSA-140 polynomials, including an 
initial search for non-skewed polynomials and the developmental phase of the skewed 
polynomial search, is approximately equivalent to 2000 CPU hours on one 250 MHz 
Origin 2000 processor. This is very approximately equivalent to 60 MIPS-years. Given 
that the sieving time was approximately 2000 MIPS-years, we arrive at the question of 
whether it would have been worthwhile to continue searching for polynomials rather 
than start sieving. 

In the case of RSA-140 there were pragmatic considerations which made it appropriate
to stop the polynomial search when we did. We wanted to use increased
idle time on workstations over the Christmas period for sieving, so we stopped the
polynomial search just before Christmas 1998.

However the question remains. To consider this question we use the larger sample
of polynomials examined during the RSA-155 polynomial search.


6.3 RSA-155 


First we give the necessary local considerations by examining the top few candidates
and their properties, and the performance of the E ranking. We then move to global
considerations. We consider a large sample of randomly generated skewed polynomials
and compare the yield of the pair being used for the factorisation to that of a pair from
the random sample with average yield. We then turn to the random sample as a whole,
and compare its distribution to that of the sample of polynomials generated during the
search. Finally, we use the sample of generated polynomials and some approximations
to consider the trade-off between polynomial search time and sieving time.


6.3.1 Local Considerations 


As in the RSA-140 factorisation, sieving for RSA-155 was conducted using both the
AKL and CWI sievers. A large portion of the RSA-140 relations (55%) were generated
using the AKL siever. Final statistics are not yet available, but we expect that portion
to be larger for RSA-155 (new contributors of sieving machines are using the AKL
siever). Hence, sieving experiments on the top few RSA-155 candidate polynomials
were conducted on both sievers.
Remark 6.3.1 RSA-155 sieving experiments using the CWI siever were again conducted
at CWI in the manner described in Remark 6.2.1. Experiments on the AKL
siever were run by Arjen Lenstra as follows. For each polynomial pair sieving was
conducted over each special q in the ranges i · 10^7 < q < i · 10^7 + 500 for i = 2, 3, ..., 12.
Since the precise number of special q per polynomial is variable, we use the number
of relations obtained per special q as the yield measure in these experiments. For
implementation specific reasons, the smoothness bounds used on the AKL siever were
slightly different to those used on the CWI siever. On the AKL siever, the rational
factor base bound was 3497867, the algebraic factor base bound 12174433. The large
prime bounds were the same as in Remark 6.2.1.


We note that in addition to actual yield, time per relation is a relevant quantity 
to compare between polynomials. For polynomials whose yields are very close, the 
average time per relation may well determine which polynomial is used. This situation 
has not yet arisen in practice. Moreover, empirically determined time per relation 
figures can be unreliable since they depend heavily on the load, memory and cache 
properties of individual machines. Hence we report here only the yield figures. 

Table 6.8 gives statistics on the top eight candidates for RSA-155. They are listed
in the order revealed by sieving experiments with the AKL siever. Polynomial pair
F155 is referred to later.


[Table columns include Pol. and Rel. Yield; a surviving rank column reads 1, 2, 3, 4, 5, 6, 8, 7; other entries not recoverable]


Table 6.8: Relative yields of the top RSA-155 polynomials 


Tests with the CWI siever place C155 higher and E155 lower than with the AKL
siever. There is a strong correlation between the ranking revealed by E and that of
the sieving experiments, particularly with the AKL siever.
As was the case with RSA-140, we comment that the ranking of approximate ratings,
though informative, may be misleading. Several other polynomials had approximate
ratings as good as the polynomials in the table (with similar m values), but were shown
by E and sieving experiments to have lesser yields.


The pair A155, being used for the factorisation, is


F1(x,y) =   119377138320 x^5
          - 80168937284997582 x^4 y
          - 66269852234118574445 x^3 y^2
          + 11816848430079521880356852 x^2 y^3
          + 7459661580071786443919743056 x y^4
          - 40679843542362159361913708405064 y^5


F2(x,y) = x - 39123079721168000771313449081 y


with s = 10800. We have a5 = 2^4 · 3^2 · 5 · 11^2 · 19 · 41 · 1759. Also, F1(x,y) has 21
roots x/y modulo the six primes from 3 to 17 (some of which are projective), and
an additional 34 roots modulo the 18 primes from 19 to 97. Notice that F1 has root
properties just as good as the other polynomials in the table. Compared to the other
polynomials in the table however, F1 has unusually small average size. This is a nice
example of root properties and size combining to produce our best polynomials.


6.3.2 Global Considerations 


A sample of 10000 random skewed polynomials was generated for RSA-155 using the
same procedure as for RSA-140. We refer to this sample as the random sample. We
again chose an average skewed polynomial to be one whose E rating is closest to the
mean of the random sample. This gives the pair F155 of Table 6.8. We find the yield
of A155 is 13.5 times that of F155. Comparing this to the figure of 7.8 for the RSA-140
selection we find that the RSA-155 selection is about 1.7 times better, relatively
speaking, than the RSA-140 selection.

Over the entire random sample we found the mean average size to be 55.4 and the
mean α to be -0.1. Comparing to the values of the top eight candidates in Table 6.8
again suggests that most of the benefit is coming from root properties, although we do
have more benefit from size here than was the case for RSA-140.

We generated a large sample of candidate polynomials during the RSA-155 search.
As a first filter, we accepted polynomials for which E(F1) < 47.0. We found 8200 such
polynomials, and these form the generated sample. Relative yield is the best measure
of the value of the generated sample, but another useful measure is the frequency with
which good polynomials occur compared to the random sample.

We examine this by comparing the distribution of the generated polynomials to
that of the random sample. Actually we examine the distribution of E values of these
polynomials, because the E measure is sufficiently quick to compute for large samples,
and it has some physical significance. Hence the term distribution of good polynomials
refers to the frequency distribution of E(F1) over the relevant sample.
Figure 6.3 shows distributions of the random and generated samples.


[Figure: frequency against E(F1)]


Figure 6.3: Distribution of random and generated polynomials for RSA-155, see Remark 6.3.2


The horizontal axis gives E ratings; better polynomials have lower E ratings and 
occur towards the right hand side. The vertical axis gives the frequency at which each 
rating occurs, relative to the modal rating. The leftmost peak is the random sample. 
We are interested in the right hand tail of this curve. The smallest rating found in the 
random sample of 10000 polynomials is E = 49.2. The largest rating considered in the 
generated sample is 47.0, so the generated sample lies entirely within the unobservable 
tail of the random sample. 


The rightmost peak is the generated sample. 


Remark 6.3.2 The frequencies of the generated sample have been renormalised to
the value at E = 47.0, so that we may see them.


As we saw in Table 6.8, the best polynomials have approximately E < 46.0. These 
polynomials lie in the unobservable tail of the generated sample, and hence in the 
unobservable tail of the unobservable tail of the random sample. 

For the sake of completeness we include the analogous figure for RSA-140 polynomials
(Figure 6.4). The random sample here contains 5700 polynomials. The generated
sample contains far fewer polynomials (400), and is not neatly distributed. Again this
is because the search procedure was under development during the RSA-140 search.
However, it is useful to note that the random sample is distributed similarly to that
of RSA-155, and the generated sample lies well into the tail of the random sample.


[Figure: frequency against rating; axis ticks 45, 44, 43, 42, 41, 40]


Figure 6.4: Distribution of random and generated polynomials for RSA-140


Let us return to the RSA-155 figure and quantify the extent of the unobservability.
We denote by μ_r(E) the relative frequency in the random sample of polynomials of
rating E.


Remark 6.3.3 The distribution shown in Figure 6.3 actually counts E values in intervals
of length 0.1. So, formally we regard μ_r(E) as being the relative frequency of
polynomials F1 with rating E - 0.05 < E(F1) < E + 0.05.


The next step is to fit a curve to the μ_r(E) distribution. Using least squares regression
to fit a polynomial to log μ_r(E) we found the best fit using

$$\mu_r(E) = \exp(a + bE + cE^2)$$

with

a = -1258, b = 45.8, c = -0.417.


Figure 6.5 shows the fit of this curve with the random sample.


The quantity

$$\nu_r(E_1, E_2) = \frac{\mu_r(E_1)}{\mu_r(E_2)} = \exp\bigl((E_1 - E_2)(b + c(E_1 + E_2))\bigr)$$

gives the frequency at which polynomials of rating E1 appear compared to those with
rating E2. Table 6.9 shows this quantity at some interesting points on the curves in
Figure 6.3 (the modal rating in the random sample is 54.6).
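The fitted curve and the frequency ratio can be evaluated as in the following sketch. The coefficients are the rounded values quoted above, so the results are indicative only. The rating 45.0 used for a "best" polynomial is an illustrative assumption (the text says only that the best polynomials have approximately E < 46.0), and the function names are hypothetical.

```python
import math

# Rounded coefficients of the quadratic fit to the log-frequency of the
# random sample, as quoted in the text.
A_R, B_R, C_R = -1258.0, 45.8, -0.417


def mu_r(e):
    """Fitted relative frequency of rating e in the random sample."""
    return math.exp(A_R + B_R * e + C_R * e * e)


def nu_r(e1, e2):
    """Fitted frequency of rating e1 relative to rating e2 (random sample)."""
    return math.exp((e1 - e2) * (B_R + C_R * (e1 + e2)))


if __name__ == "__main__":
    best, cutoff, modal = 45.0, 47.0, 54.6  # 'best' is an assumed value
    print("best vs modal  :", nu_r(best, modal))
    print("best vs cut-off:", nu_r(best, cutoff))
```

The closed form for `nu_r` follows from a + bE1 + cE1^2 - (a + bE2 + cE2^2) = (E1 - E2)(b + c(E1 + E2)), which also shows that the constant a cancels in the ratio.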
Figure 6.5: Actual and fitted distributions for random sample


Remark 6.3.4 It might be considered useful to examine the cumulative frequency of
E ratings. That is, asking “what is the relative frequency of polynomials with ratings
at least as good as E?”. However since the frequency decreases so quickly as a function
of E, the relative frequency itself becomes the determining factor. We content ourselves
with examining just the relative frequency in the context of Remark 6.3.3.


Hence, our best few polynomials occur approximately 10^18 times more rarely than
average polynomials. Moreover our best few polynomials occur more than one million
times less often than the cut-off polynomials. If we had been searching at random
then finding our cut-off polynomials, let alone our best few, would have been out of
the question.


6.3.3 Some Timing Considerations 


Fortunately, we do not search at random. Our search procedure is of course biased 
towards finding good polynomials. We now focus on the distribution of the generated 
sample rather than the random sample, to deal with the question “for how long should 
we search?”. 

Similarly to μ_r(E) above, denote by μ_g(E) the relative frequency in the generated
sample of polynomials with rating E. Using least squares regression to fit a polynomial
to log μ_g(E) we found the best fit using

$$\mu_g(E) = \exp(a + bE)$$

with

a = -278.5, b = 5.92.


Notice that we obtain a linear exponential for the generated sample as opposed to a
quadratic exponential for the random sample. Figure 6.6 shows the fit.


Table 6.9: Relative frequencies of good polynomials


Figure 6.6: Actual and fitted distributions for generated sample


Let

$$\nu_g(E_1, E_2) = \frac{\mu_g(E_1)}{\mu_g(E_2)} = \exp\{b(E_1 - E_2)\}$$

be the relative frequency at which polynomials with rating E1 appear compared to
those with rating E2, in the generated sample. We use ν_g(E1, E2) below to estimate
the E ratings of polynomials we might expect to obtain from a given amount of search
effort.

We will combine such an estimate with an approximation of the expected change
in yield. To quantify the expected change in yield as a function of E we extrapolate
crudely from pairs A155 to H155 in Table 6.8. Assume that, at least locally, yield
changes approximately linearly with E. Clearly this is not true, yield does not even
change monotonically with E, but this should suffice to give a rule-of-thumb
approximation. Using the yield figures from the AKL siever, we find that every decrease of
0.1 in E corresponds crudely to an increase of 1.2% in yield. Notice that this is the
same approximation obtained from averaging between A155 and F155.

Final statistics on the actual sieving time for RSA-155 are not yet available. We
suggest a reasonable advance estimate is 8000 MIPS-years. This is derived using
the L-function to extrapolate from the RSA-140 sieving time (see Section 2.1.4), and
factoring in the better polynomial selection for RSA-155 relative to RSA-140. So the
estimate is 2000 · 7 · 7.8/13.5 ≈ 8000 MIPS-years.

That is, every 1% improvement in polynomial yield saves 80 MIPS-years in sieving
time. Table 6.10 shows the expected benefit obtained from k times the search effort we
actually invested, for some useful k. The second column uses ν_g(E1, E2) to estimate
the expected change in E as a result of the k-altered search effort, the third column
uses the above rule-of-thumb to estimate the corresponding change in yield, compared
to A155. The final two columns give the expected change in polynomial search time
and the expected change in sieving time, respectively, in MIPS-years.


[Table columns: k, change in E, change in yield, Search Time (MY), Sieve Time (MY); entries not recoverable]


Table 6.10: Costs and benefits of polynomial search time 


The point to stop searching for polynomials is the point at which the marginal cost 
exceeds the marginal benefit. That is, at approximately twice the effort we invested 
for the RSA-155 search. We used approximately twelve machines for the RSA-155 
search. Given that we have well over 200 machines available for sieving, it would be
no practical difficulty to use, say, 25 machines for the polynomial search over the same 
period of time as we used for the actual search. 
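The trade-off can be sketched numerically. The model below is our own reading of the text, not a reconstruction of Table 6.10: we assume that k times the search effort reaches ratings that are k times rarer in the generated sample, so that exp(b ΔE) = 1/k, and we treat the cost of the search actually performed as a free parameter `search_cost_my`, since its value is not stated in this section. All names are hypothetical.

```python
import math

B_G = 5.92                    # slope of the fitted log-frequency (from the text)
YIELD_PERCENT_PER_0_1_E = 1.2 # rule of thumb: yield gain per 0.1 decrease in E
SIEVE_MY_PER_PERCENT = 80.0   # sieving MIPS-years saved per 1% of yield

# Advance sieving estimate quoted in the text, in MIPS-years.
SIEVE_ESTIMATE_MY = 2000 * 7 * 7.8 / 13.5


def expected_delta_e(k):
    """Assumed change in the best rating found with k times the search effort."""
    return -math.log(k) / B_G


def sieve_saving_my(k):
    """Expected sieving time saved (MIPS-years) under the rule of thumb."""
    yield_gain_percent = -expected_delta_e(k) / 0.1 * YIELD_PERCENT_PER_0_1_E
    return yield_gain_percent * SIEVE_MY_PER_PERCENT


def net_benefit_my(k, search_cost_my):
    """Sieving saved minus extra search time; search_cost_my is hypothetical."""
    return sieve_saving_my(k) - (k - 1) * search_cost_my


if __name__ == "__main__":
    for k in (1.5, 2.0, 3.0, 4.0):
        print(k, round(expected_delta_e(k), 3), round(sieve_saving_my(k), 1))
```

Because the saving grows like log k while the search cost grows linearly in k, the net benefit under this model peaks at a modest multiple of the original effort and then declines.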

We caution against over-reliance on the actual figures in Table 6.10. We suspect
μ_g is overly pessimistic, and of course the rule-of-thumb for yield as a function of E is
only approximate. Still, it does seem reasonable to conclude that, despite the benefit
not being great in absolute terms, it could have been worthwhile using up to twice the
effort invested in the RSA-155 polynomial search.


6.4 Summary 


In this chapter we have examined two new factorisation records, RSA-140 and RSA- 
155, and one old one, RSA-130. 


6.4.1 RSA-130 


We re-examined the polynomial selection task for RSA-130 as a means of testing
Procedure 5.1.4 and the E rating. After testing on several sets of polynomials we conclude
that E gives a reliable pre-sieving ranking of yield.

Using the E rating and Procedure 5.1.4 we found, in a tiny fraction of the time spent
on the actual RSA-130 polynomial search, several significantly better polynomials. Our
best RSA-130 polynomial has a full yield twice that of the polynomial used for the
factorisation, and approximately 5.9 times that of a non-skewed polynomial of average
yield.

In essence, the RSA-130 results begin to demonstrate the benefit of knowing “what
to look for”.


6.4.2 RSA-140 


The RSA-140 and RSA-155 results demonstrate the benefit of also knowing “how to 
look for it”. 

The RSA-140 factorisation is the first major test of Procedure 5.1.6. Our best 
polynomial pair, used for the factorisation, has a full yield close to eight times that 
of a skewed pair of average yield. Approximately a factor of four in that eight comes 
from root properties, approximately a factor of two from size. Better polynomials 
could have been obtained, but the search was truncated for practical reasons. 

For comparison, we also searched initially for non-skewed polynomials using
Procedure 5.1.4. The best non-skewed polynomial found has a full yield approximately
5.9 times that of an average non-skewed polynomial.
6.4.3 RSA-155


Locally, the RSA-155 results exemplify the benefit obtained by finding polynomials with
good combinations of size and root properties. The best polynomial pair so found has
a full yield approximately 13.5 times better than an average skewed selection.

Globally, we find that such polynomials occur approximately 10^18 times less often
in a random skewed sample than average polynomials. The sample of polynomials
generated during the search, however, is much more favourably distributed. Using this
distribution and some further approximations, we estimate it may have been beneficial
to invest up to twice the effort that went into polynomial selection for RSA-155. The
expected gain in yield over the polynomial pair used is not great, but it does exceed
the cost of obtaining it.

In any event, it is reasonable to conclude that for large RSA factorisations our
methods are able to find polynomial pairs whose yields are 10-15 times greater than
the average selection. This makes the sieving task for factorisation of 512 bit RSA
moduli entirely do-able with a small collection of machines. Indeed, it has been done.
Chapter 7


Conclusions and Further Work 


In this chapter we summarise the conclusions of this thesis, and suggest some areas for
further research.


7.1 Conclusions 


Detailed conclusions are given at the end of Chapters 3-6. In this section we merely
summarise what is said there.


• Good number field sieve polynomials are polynomials which have good yield.
Yield can be adequately accounted for by combining measures of size and root
properties.


• Once yield is correctly accounted for, polynomials with good yield must be found.
We improve on previous efforts by introducing new techniques for finding base-m
polynomials with good combinations of size and root properties. These techniques
work best, particularly with regard to non-projective root properties,
when the non-linear polynomial is highly skewed. Under conditions experienced
in large RSA factorisations, we are able to exploit root properties alone to
increase yield by up to a factor of four.


• Using our techniques for N in the current range of interest it is cost effective
to find polynomials with yields 10-15 times better than a random selection.
We factorised RSA-140 using a polynomial which is almost that good. We are
factorising RSA-155 using a polynomial which is that good.


• 512 bit RSA moduli are demonstrably insecure.


7.2 Further Work 


We suggest the following areas of further research. 
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e There may be implementation specific reasons for users to prefer non-skewed 


polynomials. It should be possible to introduce a sieve-like procedure to identify 
non-skewed F with good non-projective root properties. This may not be as 
succesful as for highly skewed F}, since we do not have the freedom of inspecting 
many rotations for each possible Fi. Even if we can find them, we wouldn't 
expect non-skewed polynomials with excellent non-projective root properties to 


have significantly better yields than the skewed polynomials we already find. 


As we move to higher degree F,, higher degree rotations could be considered. 


Quadratic rotations would be appropriate for sextic F}. 


e We might consider applying our improvements to the selection of base-m 1, mo 


polynomials (Remark 2.3.2). We might also consider extensions of Montgomery’s 
Two Quadratics method (Section 2.3.1) and novel methods for new degree pairs 
(Remark 3.1.1). 


Since we now have a better understanding of the generation of good polynomials 
it may be time to reconsider multiple polynomial versions of the number field 
sieve. When using several non-linear polynomials, the proximity of regions of 
maximal yield (usually, real roots) should be considered (see Remarks 2.3.1 and 
4.2.1). 


One consequence of using rotations in polynomial selection is that several base- 
m polynomials Fi can be found with the same common root m. It could be 
worthwhile to consider using several such polynomials Fi ¿ and seeking relations 
between each Fy j and fy. Perhaps with sufficiently many good Fi j, only the 
regions of maximal yield need be considered. The obvious disadvantage of such a 


scheme is that the matrix size increases linearly with the number of polynomials. 


e We should apply our techniques to discrete logarithm number field sieve compu- 


tations. 


• Having at least partially addressed the polynomial selection problem, it now
  becomes even more crucial that we improve the matrix reduction step. This
  is also relevant to discrete logarithm computations. The promising avenues for
  improvement are better filtering strategies and parallelisation of the reduction
  code.


• Factorisation of smaller RSA moduli, like RSA-150, would be useful to give
  a more complete picture of the growth of actual factorisation effort with N.
  Factorisation of larger RSA moduli, as well as being useful, would be exciting.
  Hence, we should factorise more RSA moduli.
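
The remark that rotations produce several polynomials with the same common root m can be illustrated with a short sketch. We assume here the linear rotation f(x) + (j_1 x - j_0)(x - m); the exact sign convention, the helper names `rotate` and `evaluate`, and the toy coefficients are illustrative assumptions, not values taken from the computations in this thesis.

```python
def rotate(coeffs, m, j1, j0):
    """Return the coefficients (lowest degree first) of f(x) + (j1*x - j0)*(x - m).

    Assumes deg f >= 2 so that the rotation only touches the three lowest terms.
    """
    out = list(coeffs)
    # (j1*x - j0)*(x - m) = j1*x^2 - (j0 + j1*m)*x + j0*m
    out[0] += j0 * m
    out[1] -= j0 + j1 * m
    out[2] += j1
    return out


def evaluate(coeffs, x):
    """Evaluate a polynomial given lowest-degree-first coefficients (Horner's rule)."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


# Toy quintic and root; any integers would do.
f = [7, -3, 0, 2, 5, 1]
m = 1000
for j1, j0 in [(1, 0), (4, -9), (-20, 136)]:
    g = rotate(f, m, j1, j0)
    # The added term vanishes at x = m, so every rotation keeps the value f(m).
    print(j1, j0, evaluate(g, m) == evaluate(f, m))
```

Since the added term vanishes at x = m, each rotated polynomial takes the same value at m as f does, so f(m) = 0 mod N is preserved and all the F_{1,j} can be paired with the same linear polynomial F_2.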


Appendix A 


Appendix to Chapter 4 


Polynomials A, ..., K are listed below. The values of m given are m ∈ Z for which
f(m) ≡ 0 mod N. The values of N are C106 for polynomials A, ..., E and polynomials
H and I, C105 for polynomials F and G, and C107 for polynomials J and K.


Polynomial A: 


10642297120196616201018579748198464994687 +
157168918105124331525011637 x - 323379595900 x^2


m = 311811767144256795964392770799295468577727849287441\
417195888224875673003757757525998997704760967662422630


Polynomial B: 


-58535465962950604788770735849031669686845 +
578123152107916050639034324 x + 660940091871 x^2


m = 111266350151832591590373321222840072472133768682060\
5812518391957850167078163045569883641392384840611818322


Polynomial C: 


—80444723076532128931843884067440931877697+ 
671898769354767184209613115x + 8765418000017? 


m = 644385945238412299450097726772298730429521837407426\
656132710287589175267555416671359532826085727240133210
Polynomial D:


-45601329349014245961324468559468003125143 +
405863886956809889611012220 x + 87588340374 x^2


m = 57022157889652460507276414622928637851608638531004\
7513013419381527088912105584724979693796690373689178237


Polynomial E:


-43070512279968963999727149653384015128406
- 140644997594088206014438353 x + 274174364727 x^2


m = 21431385359461632490985189041791385017574508889045\
6629204834574379795020566498337694386071915713661516800


Polynomial F: 


540759062604782971357139536186424874771 +
86817069333519465483641612 x + 342910527737 x^2


m = 22914359055586946906211501353855768192316423575426\
6217765793563500275674926893987223245481401160544005942


Polynomial G: 


129128767300065233631168229536267982420800
- 913049273181768816962553218 x + 1242060255079 x^2


m = 22914359055586946906211501353855768192316423575426\
6217765793563500275674926893987223245481401160544005942


Polynomial H: 


-32430287560495976143910317159823376255144
- 101643163734436736066960294 x + 190030476113 x^2


m = 17900441287572625768481534121337659378990978888143\
77815816769105476827696665209945565825606429787588581699


Polynomial I:


164086080001456034179238766543256687713827 
—401968646051742270344280172x — 785083260639? 


m = 17900441287572625768481534121337659378990978888143\
77815816769105476827696665209945565825606429787588581699


Polynomial J: 


-311653994359418670319775330136434513506986 +
763119703166287854853198889 x - 241799514805 x^2


m = 12637530599467776761853128412624277137347729851839\
924048392287605249253270797264409813230653725405155484892


Polynomial K: 


-46786964108579179806101863478910720071558
- 425704283028714253779269315 x - 540161776283 x^2


m = 12637530599467776761853128412624277137347729851839\
924048392287605249253270797264409813230653725405155484892
Appendix B


Appendix to Chapter 6 


Below are polynomials referred to in Chapter 6. Non-skewed polynomials are defined
uniquely by m (see Section 2.3), so we give only m for these polynomials. For skewed
polynomials, we give F_1, F_2 and the skewness s.
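
As a minimal sketch of why m alone suffices for a non-skewed polynomial, the coefficients can be recovered from N and m by a base-m expansion. We assume the plain expansion with digits in [0, m); Section 2.3 may adjust the digit range, and the function name and the toy values of N and m below are illustrative only.

```python
def base_m_coefficients(n, m, degree):
    """Return [c_0, ..., c_degree] with sum(c_i * m**i) == n.

    The digits c_0 .. c_{degree-1} lie in [0, m); the leading coefficient
    absorbs whatever remains after `degree` divisions.
    """
    coeffs = []
    rest = n
    for _ in range(degree):
        rest, digit = divmod(rest, m)
        coeffs.append(digit)
    coeffs.append(rest)
    return coeffs


# Toy values (not a modulus from this thesis): m is close to n**(1/5).
n = 10**30 + 57
m = 999983
coeffs = base_m_coefficients(n, m, 5)
# By construction f(m) = n, so m is a root of f modulo n.
print(sum(c * m**i for i, c in enumerate(coeffs)) == n)
```

The same call, with N taken as RSA-130 and m taken from the tables below, would give the coefficients of the corresponding degree 5 polynomial under this assumption.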


B.1 RSA-130 Polynomials 


Table B.1 gives values of m for polynomials R_i, i = 1, ..., 5.


12429620102099690356862 
12429620102099690356861 


12429620102099690356863 
13451029676646753000757 
12400786914908592973618 
12664454168907537623814 


Table B.1: Values of m for R_i


Table B.2 contains the values of m for polynomials P_1, ..., P_15 (provided by Arjen
Lenstra) and Q_1, ..., Q_18.
 i   m for P_i
 1   10519776768693341771145
 2   12112464325781598662255
 3   12175183789358781924382
 4   12922982589397980905651
 5   10056778742160802578928
 6   12893568754859383127665
 7   13239320351370744041131
 8   12506435569527239916746
 9   12666132133378233425814
10   10844346817052874470999
11   13139341559800540682218
12   12857394860965184611325
13   11856745579968929283390
14   12574411168418005980468
15   11507478393662235457656

 i   m for Q_i
 1   13892376347633905755115
 2   12453346471414472759941
 3   12189668945503746069685
 4   11837358189073863960965
 5   14227836633450858685725
 6   12846317334855496412374
 7   12485318267855789022719
 8   13664023713239125138661
 9   14262547698921937056113
10   13755004021960592085464
11   14214149085376118983291
12   15151852662623823374781
13   14185394352093247029946
14   12351139031991610954191
15   14601881988167170300659
16   13603479675779569518553
17   13809622636367237837331
18   12464197256082744853511


Table B.2: Values of m for P_i and Q_i


B.2 RSA-140 Polynomials 


A140:

F_1(x, y) = 439682082840 x^5
            + 390315678538960 x^4 y
            - 7387325293892994572 x^3 y^2
            - 19027153243742988714824 x^2 y^3
            - 63441025694464617913930613 x y^4
            + 318553917071474350392223507494 y^5

F_2(x, y) = x - 34435657809242536951779007 y

s = 4096



B140:

F_1(x, y) = 475678803600 x^5
            + 12310512454193580 x^4 y
            - 47195522868281245622 x^3 y^2
            - 18875374477888317230356 x^2 y^3
            + 708592905109171282725988833 x y^4
            - 762378574872525817932463490775 y^5

F_2(x, y) = x - 33897945514869272070938702 y

s = 3680


C140:

F_1(x, y) = 473378805900 x^5
            + 6786847212725992 x^4 y
            - 107779980090539302193 x^3 y^2
            - 326018199250839587813647 x^2 y^3
            + 2303400508103580132807667310 x y^4
            - 1306686150190334964106092161208 y^5

F_2(x, y) = x - 55773850015391247110492107 y

s = 6200


D140:

F_1(x, y) = 569366998200 x^5
            + 27579278413218810 x^4 y
            - 57999837293490323001 x^3 y^2
            - 494560012317526613653093 x^2 y^3
            + 1118023044742014236005014576 x y^4
            - 98133850888651599883245735012 y^5

F_2(x, y) = x - 37563294757862265713468083 y

s = 4119


E140:

F_1(x, y) = 54960260355 x^5
            + 97578919634740 x^4 y
            - 3693662646946497286 x^3 y^2
            - 19027153243742988714824 x^2 y^3
            - 126882051388929235827861226 x y^4
            + 1274215668285897401568894029976 y^5

F_2(x, y) = x - 68871315618485073903558014 y

s = 7360

F140:

F_1(x, y) = 9187603793796 x^5
            + 12386461804765297 x^4 y
            + 469987288306604686609 x^3 y^2
            - 889049056208116896399 x^2 y^3
            + 13987441268371968500500939 x y^4
            - 296157023846942188952843 y^5

F_2(x, y) = x - 18749758811416934921816359 y

s = 300


B.3 RSA-155 Polynomials 

A155:

F_1(x, y) = 119377138320 x^5
            - 80168937284997582 x^4 y
            - 66269852234118574445 x^3 y^2
            + 11816848430079521880356852 x^2 y^3
            + 7459661580071786443919743056 x y^4
            - 40679843542362159361913708405064 y^5

F_2(x, y) = x - 39123079721168000771313449081 y

s = 10770
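
To make the roles of F_1, F_2 and s concrete, the following sketch evaluates polynomial A155 at the common root and compares raw and skew-weighted coefficient sizes. We assume the usual weighting in which the coefficient of x^i y^(d-i) is scaled by s^(i - d/2); the helper names are ours, and the code is illustrative rather than the procedure used to select these polynomials.

```python
# Coefficients a_0 .. a_5 of F_1 for polynomial A155, lowest degree in x first.
A155 = [
    -40679843542362159361913708405064,
    7459661580071786443919743056,
    11816848430079521880356852,
    -66269852234118574445,
    -80168937284997582,
    119377138320,
]
M = 39123079721168000771313449081  # F_2(x, y) = x - M*y
S = 10770                          # skewness listed for A155


def eval_homogeneous(coeffs, x, y):
    """Evaluate sum(a_i * x^i * y^(d - i)) exactly with Python integers."""
    d = len(coeffs) - 1
    return sum(a * x**i * y ** (d - i) for i, a in enumerate(coeffs))


def skew_weighted_sizes(coeffs, s):
    """Return |a_i| * s^(i - d/2), the assumed size of each term on a skewed region."""
    d = len(coeffs) - 1
    return [abs(a) * s ** (i - d / 2) for i, a in enumerate(coeffs)]


value = eval_homogeneous(A155, M, 1)   # F_2 vanishes at (M, 1); F_1 is evaluated there
sizes = skew_weighted_sizes(A155, S)
raw = [abs(a) for a in A155]
print(len(str(value)))                 # number of decimal digits of F_1(M, 1)
print(max(raw) / min(raw))             # spread of the raw coefficient sizes
print(max(sizes) / min(sizes))         # spread after weighting by the skewness
```

Under this weighting the six terms of F_1 have comparable sizes, which is the sense in which s describes the shape of the sieving region suited to the polynomial.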


B155:

F_1(x, y) = 9734331382020 x^5
            + 186548816004600576 x^4 y
            - 2621958757709806297705 x^3 y^2
            - 11937100897656690036171818 x^2 y^3
            + 68614407568250792529987183215 x y^4
            + 72327510316160055608800665174636 y^5

F_2(x, y) = x - 18636400766678583399319133866 y

s = 6354




C155:

F_1(x, y) = 1290313469760 x^5
            + 265878007916683818 x^4 y
            - 798398403873787965715 x^3 y^2
            - 69139199782500140838174030 x^2 y^3
            + 38151865882690611373838275800 x y^4
            + 3010771538176510065263473897069897 y^5

F_2(x, y) = x - 24304026003277429995755551440 y

s = 13103


D155:

F_1(x, y) = 13773893580720 x^5
            + 293273156850908000 x^4 y
            - 1262097631040259345842 x^3 y^2
            - 7076664823854260804438715 x^2 y^3
            + 15958633172059160822941381252 x y^4
            - 60529194441853543661902699832835 y^5

F_2(x, y) = x - 15135818898777675588099420298 y

s = 5430


E155:

F_1(x, y) = 2697367246860 x^5
            + 105115408896978962 x^4 y
            + 3195116446280929272587 x^3 y^2
            - 24471071308994536760102448 x^2 y^3
            - 410312201224383538645857505823 x y^4
            + 29876530689458684852162460785950 y^5

F_2(x, y) = x - 27672044645620813150112356926 y

s = 11687


F155:

F_1(x, y) = 13773893580720 x^5
            - 23870742845170000 x^4 y
            - 3743293864033106325842 x^3 y^2
            + 34223438393917535042668495 x^2 y^3
            + 219972273530847363657409036632 x y^4
            - 2120792922120149563797447040262880 y^5

F_2(x, y) = x - 15135818898777675588099424903 y

s = 10860

G155:

F_1(x, y) = 10087787167920 x^5
            - 245018020007667129 x^4 y
            - 3601309509661665837217 x^3 y^2
            + 11986906038045125769762173 x^2 y^3
            + 252533169139973859211877889668 x y^4
            + 696041419085277901469636365164252 y^5

F_2(x, y) = x - 16108610671279075074032260691 y

s = 9741

H155:

F_1(x, y) = 8648697934800 x^5
            + 800666437942682720 x^4 y
            - 3757414786445679414797 x^3 y^2
            - 114830979471303981563343633 x^2 y^3
            + 96691565654522380316377089613 x y^4
            - 208106710060120910136340598900223 y^5

F_2(x, y) = x - 16612198869532345422993004840 y

s = 10941


I155:

F_1(x, y) = 453631755 x^5
            + 7707423309885 x^4 y
            + 574980043913676918502317 x^3 y^2
            + 143867958120855464712054 x^2 y^3
            - 553727331155303326804091804756 x y^4
            + 40416462652580845860972043137 y^5

F_2(x, y) = x - 119254994773154196771315073786 y

s = 1080